Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/06/23 18:57:12 UTC

[GitHub] [incubator-pinot] pradeepgv42 opened a new issue #5609: Organize segments into time based folders when using S3 as deep storage

pradeepgv42 opened a new issue #5609:
URL: https://github.com/apache/incubator-pinot/issues/5609


   Currently, all segments uploaded to S3 are stored in a single folder.
   It would be nice to group them into time-window-based folders.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] pradeepgv42 commented on issue #5609: Organize segments into time based folders when using S3 as deep storage

Posted by GitBox <gi...@apache.org>.
pradeepgv42 commented on issue #5609:
URL: https://github.com/apache/incubator-pinot/issues/5609#issuecomment-651262222


   Thanks, yeah, that makes sense: better organization will only be possible if segments are created in a manner that respects time windows. One alternative is to organize by max(timestamp) in a window.
   
   This might work better once segment split/merge tasks are available (not sure exactly what this would look like), so maybe it is better to tackle this then.
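
The max(timestamp) alternative could look roughly like the sketch below. It is only an illustration: the `<table>/<YYYY>/<MM>/<DD>/<segment>` layout, the function names, and the use of UTC day boundaries are all assumptions, not anything Pinot does today.

```python
from datetime import datetime, timezone


def time_folder(table: str, end_time_ms: int) -> str:
    """Return a day-level folder prefix keyed on the segment's max timestamp.

    Hypothetical layout: <table>/<YYYY>/<MM>/<DD>. Days are cut in UTC.
    """
    # Integer division avoids float rounding near day boundaries.
    day = datetime.fromtimestamp(end_time_ms // 1000, tz=timezone.utc)
    return f"{table}/{day:%Y/%m/%d}"


def segment_key(table: str, segment_name: str, end_time_ms: int) -> str:
    """Return the full (hypothetical) object key for a segment."""
    return f"{time_folder(table, end_time_ms)}/{segment_name}"
```

A segment that straddles midnight is filed under the day of its last record, so every segment gets exactly one folder, at the cost of that folder also holding some data from the previous day.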


[GitHub] [incubator-pinot] pradeepgv42 commented on issue #5609: Organize segments into time based folders when using S3 as deep storage

Posted by GitBox <gi...@apache.org>.
pradeepgv42 commented on issue #5609:
URL: https://github.com/apache/incubator-pinot/issues/5609#issuecomment-648366114


   Time-window-based folders would make it easier to account for and maintain the segments.
   For example:
   * deleting the segments older than a certain date
   * understanding the segment sizes grouped by time
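
Both tasks become simple operations on key names once the date is part of the path. The sketch below assumes a hypothetical key layout of `<table>/<YYYY>/<MM>/<DD>/<segment>` and works on a plain mapping of key to size in bytes, for example one built from a bucket listing; none of these names come from Pinot.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List


def parse_day(key: str) -> date:
    """Extract the day from a key shaped like <table>/<YYYY>/<MM>/<DD>/<segment>."""
    # rsplit keeps a table name that itself contains "/" in one piece.
    _table, year, month, day, _segment = key.rsplit("/", 4)
    return date(int(year), int(month), int(day))


def expired_keys(keys: Iterable[str], cutoff: date) -> List[str]:
    """Return the keys whose folder date is strictly before the cutoff."""
    return [k for k in keys if parse_day(k) < cutoff]


def size_by_day(sizes: Dict[str, int]) -> Dict[date, int]:
    """Sum segment sizes (in bytes) per folder date."""
    totals: Dict[date, int] = defaultdict(int)
    for key, size in sizes.items():
        totals[parse_day(key)] += size
    return dict(totals)
```

With a flat folder, the same questions would require reading each segment's metadata instead of just its path.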


[GitHub] [incubator-pinot] KKcorps commented on issue #5609: Organize segments into time based folders when using S3 as deep storage

Posted by GitBox <gi...@apache.org>.
KKcorps commented on issue #5609:
URL: https://github.com/apache/incubator-pinot/issues/5609#issuecomment-648358328


   @pradeepgv42 can you also provide your use case here?


[GitHub] [incubator-pinot] kishoreg commented on issue #5609: Organize segments into time based folders when using S3 as deep storage

Posted by GitBox <gi...@apache.org>.
kishoreg commented on issue #5609:
URL: https://github.com/apache/incubator-pinot/issues/5609#issuecomment-650837183


   We would love to do this. One problem is that a segment can span multiple days. What should we do in that case?
   
   Most segments will have data for a single date, but at day boundaries it's possible for a segment to span two days. There is also the case of late-arriving events, where a segment might have data spanning many dates.
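
To make the boundary case concrete, here is a minimal sketch that counts how many calendar days a segment touches, given its min and max timestamps in epoch milliseconds. The function name and the choice of UTC day boundaries are assumptions for illustration only.

```python
from datetime import datetime, timezone


def days_covered(start_ms: int, end_ms: int) -> int:
    """Return the number of UTC calendar days touched by [start_ms, end_ms]."""
    start = datetime.fromtimestamp(start_ms // 1000, tz=timezone.utc).date()
    end = datetime.fromtimestamp(end_ms // 1000, tz=timezone.utc).date()
    return (end - start).days + 1
```

A result of 1 means the segment fits in a single daily folder; anything larger is one of the spanning cases described above, where a folder has to be chosen by some rule or the segment has to be split.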
   
   We are working on segment split and merge tasks, with which we can plan to reorganize the segments by day, hour, etc.
   
   Would love to hear your thoughts.

