Posted to commits@pinot.apache.org by "mneedham (via GitHub)" <gi...@apache.org> on 2023/06/22 13:55:28 UTC

[GitHub] [pinot] mneedham opened a new pull request, #10961: Backfill segments into real-time table

mneedham opened a new pull request, #10961:
URL: https://github.com/apache/pinot/pull/10961

   I want to backfill segments into a real-time table, but the code in `SegmentGenerationUtils#getTableConfig` seems to be designed to only allow that for offline tables.
   
   The stub code in `SegmentGenerationJobRunnerTest.java` doesn't properly reflect the actual values returned by the API. The stub returns:
   
   ```
   {
   "tableName": "events_rc_OFFLINE",
   "tableType": "OFFLINE",
   ...
   ```
   
   which misses the nesting that the API actually returns:
   
   ```
   {
   "OFFLINE": {
     "tableName": "events_rc_OFFLINE",
     "tableType": "OFFLINE",
   ```
   
   This PR updates those tests as well.
   
   I wasn't completely sure what to do if the user tries to upload a segment to a hybrid table without indicating whether it should go to the real-time or offline table, so we throw an exception in that case.
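
The selection rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class `TableConfigSelector` and the method `selectTableConfig` are hypothetical names, the response is modelled as an already-parsed map rather than real JSON, and the `REALTIME` key is assumed to mirror the `OFFLINE` key shown in the nested response.

```java
import java.util.Map;

// Hypothetical helper, not part of Pinot: picks one table config out of the
// nested response shape {"OFFLINE": {...}, "REALTIME": {...}}.
public class TableConfigSelector {

    // requestedType is "OFFLINE", "REALTIME" (assumed key), or null if the user gave no type.
    public static Map<String, Object> selectTableConfig(
            Map<String, Map<String, Object>> response, String requestedType) {
        if (requestedType != null) {
            Map<String, Object> config = response.get(requestedType);
            if (config == null) {
                throw new IllegalArgumentException("No " + requestedType + " config in response");
            }
            return config;
        }
        if (response.isEmpty()) {
            throw new IllegalArgumentException("Response contains no table config");
        }
        if (response.size() > 1) {
            // Hybrid table and no type given: refuse to guess.
            throw new IllegalStateException("Hybrid table: specify OFFLINE or REALTIME");
        }
        // Exactly one table type exists, so the choice is unambiguous.
        return response.values().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, Object> offline =
                Map.<String, Object>of("tableName", "events_rc_OFFLINE", "tableType", "OFFLINE");
        Map<String, Map<String, Object>> response = Map.of("OFFLINE", offline);
        System.out.println(selectTableConfig(response, null).get("tableName"));
    }
}
```

Throwing for the ambiguous hybrid case keeps the caller from silently writing a segment to the wrong table; a default could be chosen instead, but that is a design decision left open here.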


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on pull request #10961: Backfill segments into real-time table

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on PR #10961:
URL: https://github.com/apache/pinot/pull/10961#issuecomment-1608598720

   cc @snleee 




[GitHub] [pinot] eaugene commented on pull request #10961: Backfill segments into real-time table

Posted by "eaugene (via GitHub)" <gi...@apache.org>.
eaugene commented on PR #10961:
URL: https://github.com/apache/pinot/pull/10961#issuecomment-1603875645

   @mneedham Can you share your thoughts on the following:
   1. When real-time segments are uploaded, there's a chance we may end up with two segments having overlapping timestamps. How would this be handled?
   2. Did you get a chance to test uploading segments to upsert tables? I'm curious about that case, because we place all records whose primary keys belong to the same partition on a single Pinot server so that de-duplication can be done. When uploading, how can we ensure the segment is placed only on that specific server?




[GitHub] [pinot] mneedham commented on pull request #10961: Backfill segments into real-time table

Posted by "mneedham (via GitHub)" <gi...@apache.org>.
mneedham commented on PR #10961:
URL: https://github.com/apache/pinot/pull/10961#issuecomment-1617506391

   1) I'm not sure what would happen in that case. I expect we'd end up with duplicates, as Pinot doesn't expect there to be overlap, so the onus would be on the user uploading the segment to make sure it doesn't overlap.
   2) I haven't tried it with upserts, so I don't know the answer to that.
   
   I think @snleee has more knowledge than me to answer the questions!
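
A user-side check for the overlap mentioned in point 1 might look like the sketch below. It is an assumption-laden illustration, not anything Pinot provides: `SegmentTimeRangeCheck` and `overlaps` are hypothetical names, and each segment is assumed to expose an inclusive start and end time in the same unit (for example, epoch milliseconds).

```java
// Hypothetical pre-upload check, not part of Pinot.
public class SegmentTimeRangeCheck {

    // Returns true if the inclusive ranges [startA, endA] and [startB, endB] share any instant.
    public static boolean overlaps(long startA, long endA, long startB, long endB) {
        return startA <= endB && startB <= endA;
    }

    public static void main(String[] args) {
        // An existing segment covering 1000..2000 and a backfill segment covering 2000..3000.
        System.out.println(overlaps(1000L, 2000L, 2000L, 3000L));
    }
}
```

Because the ranges are treated as inclusive, two segments that merely touch at a boundary timestamp count as overlapping; switch one comparison to a strict `<` if end times are exclusive.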

