Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/23 04:05:32 UTC
[GitHub] [iceberg] dmgcodevil opened a new issue #3168: How to sort Spark DataFrame
dmgcodevil opened a new issue #3168:
URL: https://github.com/apache/iceberg/issues/3168
I have the following partition spec:
```
timestamp: day
id: bucket-10
```
This is how I sort the Spark DataFrame:
```
IcebergSpark.registerBucketUDF(sc, "iceberg_bucket10", DataTypes.StringType, 10)
source.sortWithinPartitions(col("int_field"), expr("iceberg_bucket10(id)"))
```
However, the job is failing:
```
java.lang.IllegalStateException: Already closed files for partition: timestamp_day=2021-09-23/id_bucket=0
```
What am I doing wrong?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] dmgcodevil edited a comment on issue #3168: How to sort Spark DataFrame ?
Posted by GitBox <gi...@apache.org>.
dmgcodevil edited a comment on issue #3168:
URL: https://github.com/apache/iceberg/issues/3168#issuecomment-926128541
However, if I reverse the order of columns, i.e.:
```
source.sortWithinPartitions(expr("iceberg_bucket10(id)"), col("int_field"))
```
it works.
Is that expected behavior?
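A possible explanation, sketched under assumptions: if the writer keeps only one open file per task and expects each task's rows to arrive grouped by partition, then sorting by `int_field` first interleaves the buckets, and a bucket whose file was already closed shows up again. The model below is plain Scala with no Spark or Iceberg dependency; `Row`, `firstReopenedPartition`, and the sample data are hypothetical names for illustration, not Iceberg's actual implementation.

```scala
object ClusteringSketch {
  // Hypothetical stand-in for a row: only the two sort keys matter here.
  final case class Row(bucket: Int, intField: Int)

  // Simplified model of a writer that keeps one open file at a time.
  // Returns the first partition that reappears after its file was closed.
  def firstReopenedPartition(rows: Seq[Row]): Option[Int] = {
    val closed = scala.collection.mutable.Set.empty[Int]
    var current = Option.empty[Int]
    rows.iterator.map(_.bucket).find { b =>
      if (current.contains(b)) false  // same partition, keep writing
      else {
        current.foreach(closed += _)  // partition changed: close the open file
        current = Some(b)
        closed.contains(b)            // true if this partition was closed earlier
      }
    }
  }

  val rows: Seq[Row] = Seq(Row(0, 2), Row(1, 1), Row(0, 1), Row(1, 2))

  // Mirrors sortWithinPartitions(int_field, bucket): buckets arrive as 0, 1, 0, 1.
  val byIntFieldFirst: Seq[Row] = rows.sortBy(r => (r.intField, r.bucket))

  // Mirrors sortWithinPartitions(bucket, int_field): buckets arrive as 0, 0, 1, 1.
  val byBucketFirst: Seq[Row] = rows.sortBy(r => (r.bucket, r.intField))

  def main(args: Array[String]): Unit = {
    println(firstReopenedPartition(byIntFieldFirst)) // Some(0)
    println(firstReopenedPartition(byBucketFirst))   // None
  }
}
```

In this model the partition expression has to lead the sort so that all rows of one bucket are contiguous; under the same assumption, the `timestamp: day` partition field would need to be part of the leading sort keys as well.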