Posted to issues@iceberg.apache.org by "0xffmeta (via GitHub)" <gi...@apache.org> on 2023/02/17 04:04:03 UTC

[GitHub] [iceberg] 0xffmeta opened a new issue, #6865: Cache delete files when reading v2 format with merge-on-read mode

0xffmeta opened a new issue, #6865:
URL: https://github.com/apache/iceberg/issues/6865

   ### Feature Request / Improvement
   
   When we use Flink `upsert` mode to write a v2 format table, we found that the table scan is very slow because a large number of delete files need to be loaded into `DeleteFilter`.
   
   We checked the `FileScanTask` and observed that the delete files might be read again and again for the data files within the same partition, e.g. one partition with several commits:
   
   - data1, data2
   - data3, data4, data5 with deleteA, deleteB, deleteC
   - data6 with deleteD
   
   When scanning the earlier data files (e.g. data1, data2), the delete files committed later (e.g. deleteA to deleteD) need to be loaded into memory, and they are loaded again for each scan (data1 and data2).
   This could slow down read performance if there are many writers committing frequently.
   
   We can introduce a `LoadingCache` in `DeleteFilter` to cache the delete file content. 
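   
   A minimal sketch of the idea, not the actual `DeleteFilter` code: the `DeleteCache` class, the `countLoads` helper, and the string keys standing in for delete file content are all hypothetical, and a `ConcurrentHashMap` is used here as a stand-in for a real `LoadingCache` so the example has no dependencies. It only illustrates how keying the cache by delete file path would avoid reloading deleteA to deleteD for both data1 and data2.
   
   ```java
   import java.util.List;
   import java.util.Map;
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.atomic.AtomicInteger;
   import java.util.function.Function;

   public class DeleteFileCacheSketch {

       // Hypothetical stand-in for a LoadingCache keyed by delete file path.
       static final class DeleteCache {
           private final Map<String, Set<String>> loaded = new ConcurrentHashMap<>();
           private final Function<String, Set<String>> loader;

           DeleteCache(Function<String, Set<String>> loader) {
               this.loader = loader;
           }

           // Runs the loader only the first time a given path is requested.
           Set<String> get(String deleteFilePath) {
               return loaded.computeIfAbsent(deleteFilePath, loader);
           }
       }

       // Counts how many times delete file content is loaded while scanning
       // every data file against every applicable delete file.
       static int countLoads(List<String> dataFiles, List<String> deleteFiles, boolean useCache) {
           AtomicInteger loads = new AtomicInteger();
           // Assumption: a real loader would read and decode the delete file here.
           Function<String, Set<String>> loader = path -> {
               loads.incrementAndGet();
               return Set.of(path + "-key");
           };
           DeleteCache cache = new DeleteCache(loader);

           for (String dataFile : dataFiles) {
               for (String deleteFile : deleteFiles) {
                   Set<String> deletedKeys =
                       useCache ? cache.get(deleteFile) : loader.apply(deleteFile);
                   // A real reader would now filter dataFile's rows against deletedKeys.
               }
           }
           return loads.get();
       }

       public static void main(String[] args) {
           List<String> dataFiles = List.of("data1", "data2");
           List<String> deleteFiles = List.of("deleteA", "deleteB", "deleteC", "deleteD");
           System.out.println("without cache: " + countLoads(dataFiles, deleteFiles, false) + " loads");
           System.out.println("with cache: " + countLoads(dataFiles, deleteFiles, true) + " loads");
       }
   }
   ```
   
   In this toy setup the uncached scan loads delete files 8 times (2 data files x 4 delete files) and the cached scan loads them 4 times. A real cache would also need a size or weight bound and an eviction policy, since delete file content can be large.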
   
   
   ### Query engine
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] 0xffmeta commented on issue #6865: Cache delete files when reading v2 format with merge-on-read mode

Posted by "0xffmeta (via GitHub)" <gi...@apache.org>.
0xffmeta commented on issue #6865:
URL: https://github.com/apache/iceberg/issues/6865#issuecomment-1437904012

   cc @rdblue @aokolnychyi if you have some time, maybe you can take a look and check if this makes sense to you. 




Re: [I] Cache delete files when reading v2 format with merge-on-read mode [iceberg]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #6865:
URL: https://github.com/apache/iceberg/issues/6865#issuecomment-1877934826

   This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'




[GitHub] [iceberg] github-actions[bot] commented on issue #6865: Cache delete files when reading v2 format with merge-on-read mode

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #6865:
URL: https://github.com/apache/iceberg/issues/6865#issuecomment-1685435946

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.




Re: [I] Cache delete files when reading v2 format with merge-on-read mode [iceberg]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #6865: Cache delete files when reading v2 format with merge-on-read mode
URL: https://github.com/apache/iceberg/issues/6865

