Posted to issues@iceberg.apache.org by "liuzx8888 (via GitHub)" <gi...@apache.org> on 2023/03/29 14:21:08 UTC

[GitHub] [iceberg] liuzx8888 opened a new issue, #7237: Procedures rewrite_data_files not work

liuzx8888 opened a new issue, #7237:
URL: https://github.com/apache/iceberg/issues/7237

   ### Query engine
   
   Spark 3.3.2
   
   ### Question
   
   Table schema:
   ![image](https://user-images.githubusercontent.com/10862577/228567721-f4d0df39-b9b0-4564-95d6-abdc7d6926a4.png)
   
   I executed the following spark-sql statement:
   `CALL system.rewrite_data_files(table => 'iceberg_test.hz_jzjbxx1', strategy => 'sort', sort_order => 'zorder(hzxm, hzsfzh, blh, ynkh, ybkh, hzid)');`
   
   ![image](https://user-images.githubusercontent.com/10862577/228568499-eb8cb0cd-ff2c-4f1c-bf86-2cad26c4f2b8.png)
   
   How can I get the `rewrite_data_files` procedure to run successfully?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on issue #7237: Procedures rewrite_data_files not work

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488735040

   What files did you expect to be rewritten? Did you check the docs? For example, there are several parameters that affect whether or not a partition will have files rewritten. See
   
   https://iceberg.apache.org/javadoc/1.2.0/org/apache/iceberg/actions/BinPackStrategy.html#field.summary


[GitHub] [iceberg] liuzx8888 commented on issue #7237: Procedures rewrite_data_files not work

Posted by "liuzx8888 (via GitHub)" <gi...@apache.org>.
liuzx8888 commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488904049

   @RussellSpitzer Thank you very much. After adding the option, it works.


[GitHub] [iceberg] liuzx8888 closed issue #7237: Procedures rewrite_data_files not work

Posted by "liuzx8888 (via GitHub)" <gi...@apache.org>.
liuzx8888 closed issue #7237: Procedures rewrite_data_files not work
URL: https://github.com/apache/iceberg/issues/7237


[GitHub] [iceberg] liuzx8888 commented on issue #7237: Procedures rewrite_data_files not work

Posted by "liuzx8888 (via GitHub)" <gi...@apache.org>.
liuzx8888 commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488774635

   @RussellSpitzer
   I was following this document: https://iceberg.apache.org/docs/latest/spark-procedures/
   
   `CALL catalog_name.system.remove_orphan_files(table => 'db.sample', location => 'tablelocation/data')`
   
   ![image](https://user-images.githubusercontent.com/10862577/228577888-bd6bddbf-ad92-4413-b54a-656473e62880.png)
   
   
   My understanding is that after executing this statement, the table's data files will be rewritten, sorted by Z-order, producing new data files and metadata. However, when I looked at the corresponding files on HDFS, their modification times had not changed. I am new to Iceberg, so maybe there is something I did not understand?
   


[GitHub] [iceberg] RussellSpitzer commented on issue #7237: Procedures rewrite_data_files not work

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on issue #7237:
URL: https://github.com/apache/iceberg/issues/7237#issuecomment-1488787624

   Yes, right below that table is a list of classes and all of their other options. By default, none of the optimize commands will rewrite the entire table; they only touch files that pass their filters. Please read the link I added above.
   
   If you want to rewrite all files, look carefully and you will see that `rewrite-all` is one of the options.
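   
   For illustration only, a minimal sketch of how the call from the original report might look with that option enabled. The thread does not show the exact command that finally worked, so this is an assumption: it passes `rewrite-all` through the procedure's `options` map and reuses the table name and sort columns from the report.
   
   ```sql
   -- Assumed fix: force every data file to be rewritten, bypassing the default file filters.
   -- Table name and zorder columns are copied from the original report.
   CALL system.rewrite_data_files(
     table => 'iceberg_test.hz_jzjbxx1',
     strategy => 'sort',
     sort_order => 'zorder(hzxm, hzsfzh, blh, ynkh, ybkh, hzid)',
     options => map('rewrite-all', 'true')
   );
   ```
   
   Rewriting every file can be expensive on a large table, so narrowing the filters (for example, lowering `min-input-files`) may be preferable when only some partitions need compaction.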

