Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/06/16 15:33:20 UTC
[GitHub] [iceberg] bartosz25 opened a new issue, #5065: The check-ordering purpose
bartosz25 opened a new issue, #5065:
URL: https://github.com/apache/iceberg/issues/5065
Hi,
I'm learning Iceberg and am struggling with the `check-ordering` option. As far as I understood, it
> Checks if input schema and table schema are same
https://iceberg.apache.org/docs/latest/spark-configuration/#write-options
I got the test case from the repo...
https://github.com/apache/iceberg/blob/2531545e3cd3b97494c9e3c137cfe04f4459a9fb/spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/source/TestPartitionValues.java
... and used it to play a bit with the option:
```
sparkSession.sql(
  """
    |CREATE OR REPLACE TABLE local.db.letters (
    |  id STRING NOT NULL,
    |  letter1 STRING NOT NULL,
    |  letter2 STRING NOT NULL
    |) USING iceberg
    |""".stripMargin)
// The select below is not strictly necessary; I kept it to make the column
// reordering simpler and to follow the example from the unit test.
sparkSession.sql("SELECT '2' AS id, 'a' AS letter1, 'A' AS letter2")
  .select("letter2", "id", "letter1")
  .write
  .option("check-ordering", "true")
  .insertInto("local.db.letters")
```
With `check-ordering` enabled and the columns reordered, I expected the write to fail. Instead, it succeeded with a position-based insert:
```
+---+-------+-------+
|id |letter1|letter2|
+---+-------+-------+
|A  |2      |a      |
+---+-------+-------+
```
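For illustration, the row above is exactly what a purely position-based mapping would produce. The sketch below is a toy model in plain Java, not Spark's or Iceberg's actual code; the class and method names (`ColumnAlignment`, `byPosition`, `byName`) are hypothetical. It contrasts position-based alignment with name-based alignment for the same input:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, no Spark or Iceberg involved.
public class ColumnAlignment {

    // Position-based: the i-th incoming value lands in the i-th table column.
    // Incoming column names play no role at all.
    static Map<String, String> byPosition(List<String> tableColumns,
                                          List<String> inputValues) {
        if (tableColumns.size() != inputValues.size()) {
            throw new IllegalArgumentException("Column count mismatch");
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < tableColumns.size(); i++) {
            row.put(tableColumns.get(i), inputValues.get(i));
        }
        return row;
    }

    // Name-based: each table column looks up the incoming value with the same name.
    static Map<String, String> byName(List<String> tableColumns,
                                      List<String> inputNames,
                                      List<String> inputValues) {
        Map<String, String> row = new LinkedHashMap<>();
        for (String column : tableColumns) {
            int idx = inputNames.indexOf(column);
            if (idx < 0) {
                throw new IllegalArgumentException("Missing column: " + column);
            }
            row.put(column, inputValues.get(idx));
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> table = List.of("id", "letter1", "letter2");
        // Same order as .select("letter2", "id", "letter1") in the snippet above.
        List<String> names = List.of("letter2", "id", "letter1");
        List<String> values = List.of("A", "2", "a");

        System.out.println(byPosition(table, values));    // {id=A, letter1=2, letter2=a}
        System.out.println(byName(table, names, values)); // {id=2, letter1=a, letter2=A}
    }
}
```

The first result matches the table printed above. This is consistent with the Spark documentation for `DataFrameWriter.insertInto`, which states that the method ignores column names and uses position-based resolution.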
Thinking it might be an issue with my local setup (Iceberg 0.13.1, Spark 3.2.0), I cloned the project repo and:
* extended the aforementioned unit test with a read that prints the written data:
```
spark.read()
    .format("iceberg")
    .option(SparkReadOptions.VECTORIZATION_ENABLED, String.valueOf(vectorized))
    .load(location.toString())
    .show();
```
* changed the check-ordering flag to true:
```
df.select("data", "id").write()
    .format("iceberg")
    .mode(SaveMode.Append)
    .option(SparkWriteOptions.CHECK_ORDERING, "true")
    .save(location.toString());
```
Surprisingly, the printed results were correct, but the `check-ordering` flag seems to have no effect on the writer: the write succeeds whether the flag is enabled or disabled.
I'm surely missing something. Can you shed some light on it? Is my understanding of `check-ordering` correct? If it is, do you have a minimal reproducible example in which the insert fails because of the reordered `select(...)` with `check-ordering` enabled?
Thank you.
Best,
Bartosz.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] github-actions[bot] closed issue #5065: The check-ordering purpose
Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #5065: The check-ordering purpose
URL: https://github.com/apache/iceberg/issues/5065
[GitHub] [iceberg] github-actions[bot] commented on issue #5065: The check-ordering purpose
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #5065:
URL: https://github.com/apache/iceberg/issues/5065#issuecomment-1366992248
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.
[GitHub] [iceberg] github-actions[bot] commented on issue #5065: The check-ordering purpose
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #5065:
URL: https://github.com/apache/iceberg/issues/5065#issuecomment-1350134025
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.