Posted to issues@iceberg.apache.org by "liuzx8888 (via GitHub)" <gi...@apache.org> on 2023/03/29 14:21:08 UTC
[GitHub] [iceberg] liuzx8888 opened a new issue, #7237: Procedures rewrite_data_files not work
liuzx8888 opened a new issue, #7237:
URL: https://github.com/apache/iceberg/issues/7237
### Query engine
spark 3.3.2
### Question
Table schema:
![image](https://user-images.githubusercontent.com/10862577/228567721-f4d0df39-b9b0-4564-95d6-abdc7d6926a4.png)
I executed this spark-sql script:
`CALL system.rewrite_data_files(table => 'iceberg_test.hz_jzjbxx1', strategy => 'sort', sort_order => 'zorder(hzxm, hzsfzh, blh, ynkh, ybkh, hzid)');`
![image](https://user-images.githubusercontent.com/10862577/228568499-eb8cb0cd-ff2c-4f1c-bf86-2cad26c4f2b8.png)
How can I get the rewrite_data_files procedure to run successfully?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] RussellSpitzer commented on issue #7237: Procedures rewrite_data_files not work
Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488735040
What files did you expect to be rewritten? Did you check the docs? For example, there are several parameters that affect whether or not a partition will have files rewritten. See
https://iceberg.apache.org/javadoc/1.2.0/org/apache/iceberg/actions/BinPackStrategy.html#field.summary
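As a rough illustration of how such a parameter is passed, the procedure accepts an `options` map whose keys are the option names listed on that Javadoc page. The sketch below lowers `min-input-files`, the minimum number of candidate files a file group needs before it is rewritten. The table name is the one from the original question, and the value `'2'` is only an example, not a recommendation.

```sql
-- Sketch only: lower the file-count threshold so that smaller groups of
-- candidate files qualify for compaction. The value '2' is an arbitrary example.
CALL system.rewrite_data_files(
  table => 'iceberg_test.hz_jzjbxx1',
  options => map('min-input-files', '2')
);
```

Other filters on the same page, such as `min-file-size-bytes` and `max-file-size-bytes`, are set the same way by adding more key/value pairs to the map.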
[GitHub] [iceberg] liuzx8888 commented on issue #7237: Procedures rewrite_data_files not work
Posted by "liuzx8888 (via GitHub)" <gi...@apache.org>.
liuzx8888 commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488904049
@RussellSpitzer Thank you very much. After adding the option, it works.
[GitHub] [iceberg] liuzx8888 closed issue #7237: Procedures rewrite_data_files not work
Posted by "liuzx8888 (via GitHub)" <gi...@apache.org>.
liuzx8888 closed issue #7237: Procedures rewrite_data_files not work
URL: https://github.com/apache/iceberg/issues/7237
[GitHub] [iceberg] liuzx8888 commented on issue #7237: Procedures rewrite_data_files not work
Posted by "liuzx8888 (via GitHub)" <gi...@apache.org>.
liuzx8888 commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488774635
@RussellSpitzer
I followed this document: https://iceberg.apache.org/docs/latest/spark-procedures/
`CALL catalog_name.system.remove_orphan_files(table => 'db.sample', location => 'tablelocation/data')`
![image](https://user-images.githubusercontent.com/10862577/228577888-bd6bddbf-ad92-4413-b54a-656473e62880.png)
My understanding is that running this script rewrites the table's data files sorted by Z-order, generating new data files and metadata. However, when I looked at the corresponding files on HDFS, their modification times had not changed. I am new to Iceberg, so maybe there is something I did not understand?
[GitHub] [iceberg] RussellSpitzer commented on issue #7237: Procedures rewrite_data_files not work
Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488787624
Yes, right below that table is a list of classes and all of their other options. By default, none of the optimize commands will rewrite the entire table. They only touch files that pass their filters; please read the link I added above.
If you want to rewrite all files, look carefully and you will see that `rewrite-all` is one of the options.
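For readers landing on this thread, a minimal sketch of what forcing a full rewrite might look like is shown here. It reuses the call from the original question and adds an `options` map with `rewrite-all` set to `'true'`, so every data file is selected regardless of the size and count filters. The thread does not say which option the reporter actually added, so treat this as one plausible fix rather than the confirmed one.

```sql
-- Sketch only: same Z-order rewrite as in the question, but with the
-- file filters bypassed so that all data files are rewritten.
CALL system.rewrite_data_files(
  table => 'iceberg_test.hz_jzjbxx1',
  strategy => 'sort',
  sort_order => 'zorder(hzxm, hzsfzh, blh, ynkh, ybkh, hzid)',
  options => map('rewrite-all', 'true')
);
```

Rewriting every file can be expensive on a large table, so narrowing the filters is usually the gentler choice when only some files need compaction.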