Posted to issues@iceberg.apache.org by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/04/27 16:00:22 UTC

[GitHub] [iceberg] aokolnychyi opened a new issue, #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

aokolnychyi opened a new issue, #7449:
URL: https://github.com/apache/iceberg/issues/7449

   ### Feature Request / Improvement
   
   We should expose data and file sequence numbers on `ContentFile` or `ContentScanTask`.
   
   ### Query engine
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "gaborkaszab (via GitHub)" <gi...@apache.org>.
gaborkaszab commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1527192469

   Thanks for opening this issue, @aokolnychyi! Is anyone working on this? If not, I can take a look.




[GitHub] [iceberg] rdblue commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1530587489

   @gaborkaszab, can you link your PR here?




[GitHub] [iceberg] aokolnychyi closed issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi closed issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask
URL: https://github.com/apache/iceberg/issues/7449




[GitHub] [iceberg] gaborkaszab commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "gaborkaszab (via GitHub)" <gi...@apache.org>.
gaborkaszab commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1536402557

   Just an FYI: I have a WIP PR. I'm still making some test changes and will submit it this weekend or early next week.




[GitHub] [iceberg] aokolnychyi commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1536689425

   Sounds good, @gaborkaszab! Could you link it here once the PR is up?




[GitHub] [iceberg] stevenzwu commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1533448766

   I will rebase the JSON parser PR #6934 after this.




[GitHub] [iceberg] aokolnychyi commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1533545412

   This logic is also required for rewriting position deletes. We promised it a long time ago, and it shouldn't be that hard to implement. I can probably look into it this week. Let me know if you start working on it earlier, @gaborkaszab.




[GitHub] [iceberg] aokolnychyi commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1550111236

   Resolved by PR #7555.




[GitHub] [iceberg] aokolnychyi commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1528167110

   @gaborkaszab, I took a look at that PR and I believe its primary goal is different. Would you be interested in creating a separate PR to expose data and file sequence numbers? I'd try to expose them on `ContentFile` first. If that is too complicated, we can consider doing this on `ContentScanTask`. I think #5760 contains a reasonable idea of doing this in `InheritableMetadata`. We can add @chenjunjiedada as co-author if you decide to reuse some parts.
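
The approach discussed in this comment, resolving sequence numbers while manifest entries are read and then exposing them on the file object, might be sketched roughly as follows. This is a simplified, self-contained illustration and not the code from PR #5760 or from the eventual fix: `SketchFile`, `Status`, and `inherit` are hypothetical names, and the rule modeled here (an added entry whose sequence number is still null takes the sequence number of its manifest) is a simplification of the inheritance described in the Iceberg table spec.

```java
public class SequenceNumberSketch {

  /** Entry status in a manifest, reduced to what this sketch needs. */
  enum Status { EXISTING, ADDED, DELETED }

  /** Hypothetical stand-in for a content file; NOT the Iceberg ContentFile interface. */
  static final class SketchFile {
    private final String path;
    private final Long dataSequenceNumber; // null means "not assigned yet"
    private final Long fileSequenceNumber; // null means "not assigned yet"

    SketchFile(String path, Long dataSequenceNumber, Long fileSequenceNumber) {
      this.path = path;
      this.dataSequenceNumber = dataSequenceNumber;
      this.fileSequenceNumber = fileSequenceNumber;
    }

    String path() { return path; }
    Long dataSequenceNumber() { return dataSequenceNumber; }
    Long fileSequenceNumber() { return fileSequenceNumber; }
  }

  /**
   * Assumed, simplified inheritance rule: for an ADDED entry, a null sequence
   * number is replaced by the manifest's sequence number. Explicit values and
   * entries with any other status are returned unchanged.
   */
  static SketchFile inherit(SketchFile file, Status status, long manifestSequenceNumber) {
    Long dataSeq = file.dataSequenceNumber();
    Long fileSeq = file.fileSequenceNumber();
    if (status == Status.ADDED) {
      if (dataSeq == null) {
        dataSeq = manifestSequenceNumber;
      }
      if (fileSeq == null) {
        fileSeq = manifestSequenceNumber;
      }
    }
    return new SketchFile(file.path(), dataSeq, fileSeq);
  }

  public static void main(String[] args) {
    // A newly added file with no sequence numbers yet, read from a manifest with sequence number 7.
    SketchFile added = inherit(new SketchFile("a.parquet", null, null), Status.ADDED, 7L);
    // An existing file keeps the explicit values stored in its entry.
    SketchFile existing = inherit(new SketchFile("b.parquet", 3L, 5L), Status.EXISTING, 7L);

    System.out.println(added.path() + " data=" + added.dataSequenceNumber()
        + " file=" + added.fileSequenceNumber());
    System.out.println(existing.path() + " data=" + existing.dataSequenceNumber()
        + " file=" + existing.fileSequenceNumber());
  }
}
```

With both values available on the file object, a caller such as a position delete rewrite would not need to go back to the manifest entry to find out which sequence numbers apply to a file.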




[GitHub] [iceberg] jackye1995 commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "jackye1995 (via GitHub)" <gi...@apache.org>.
jackye1995 commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1527800112

   assigned to you!




[GitHub] [iceberg] gaborkaszab commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "gaborkaszab (via GitHub)" <gi...@apache.org>.
gaborkaszab commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1541617431

   Removed this from the 1.3.0 milestone to reduce noise, as the PR is also added to the milestone items.




[GitHub] [iceberg] aokolnychyi commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1528022028

   PR #5760 is relevant. However, it seems to cover more than what we need, so it may be a good idea to just focus on exposing what we currently have. 




[GitHub] [iceberg] gaborkaszab commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "gaborkaszab (via GitHub)" <gi...@apache.org>.
gaborkaszab commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1531453274

   Thanks for taking a look at this! Sure, I can create a separate PR that simply focuses on exposing the sequence numbers through `ContentFile` or, if that's not straightforward, on `ContentScanTask`. I don't have anything ready at the moment, though. I see this item was added to 1.3.0. I wonder if this is because someone is expecting this change ASAP. Frankly, I'll only have time to take care of this sometime around next week, so if there is no real time pressure, we could push this out by one release.




[GitHub] [iceberg] gaborkaszab commented on issue #7449: Expose data and file sequence numbers on ContentFile or ContentScanTask

Posted by "gaborkaszab (via GitHub)" <gi...@apache.org>.
gaborkaszab commented on issue #7449:
URL: https://github.com/apache/iceberg/issues/7449#issuecomment-1534804650

   Good to know there is a need for this :) I've started looking into it now and will let you know once I have anything to share.

