Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/10/11 04:41:21 UTC

[GitHub] [pinot] SukeshUB opened a new issue, #9567: Last day data in an offline table not being returned by query in a hybrid table when there is no data in real time.

SukeshUB opened a new issue, #9567:
URL: https://github.com/apache/pinot/issues/9567

   I have a hybrid table. I ran the Spark offline ingestion job for data through September 26th (it is October 10th as of this post, which is also when I found the issue). The ingestion was successful, and I see the segments in the Pinot UI and in S3.
   
   Spark Output:
   ```
   Response for pushing table enriched_click segment enriched_click_OFFLINE_2022-09-26_2022-09-26_3 to location https://pinot.internal.com.sovrn.startree.cloud/ - 200: {"status":"Successfully uploaded segment: enriched_click_OFFLINE_2022-09-26_2022-09-26_3 of table: enriched_click_OFFLINE"}
   ```
   Segments found in table:
   ![image](https://user-images.githubusercontent.com/25570843/194998014-63fd4160-317b-492f-bd25-8d37b246a7d6.png)
   
   Segment metadata:
   ![image](https://user-images.githubusercontent.com/25570843/194998134-2c8a95d9-35ff-431a-bdce-5becc87b7df2.png)
   
   However, the query console returns no data for that last date. I can only see data through September 25th, even though I ran the job through the 26th.
   
   ![image](https://user-images.githubusercontent.com/25570843/194998259-5de9450a-d2c7-44c9-8900-54fcf4b8ea22.png)
   
   However, I can see the data when I query the OFFLINE table explicitly:
   ![image](https://user-images.githubusercontent.com/25570843/194998314-c5569daf-28f5-4588-81d3-53378a286992.png)
   
   As I understand it, this happens because I don't have any data in my real-time table yet. I might see the data for the 27th once I have data in my real-time table, but I haven't confirmed that yet. I will post an update once I ingest real-time data. Either way, the question stands: why doesn't Pinot return the last day of data in an offline table when the real-time table has no data?
   
   Notes:
   The retention on my real-time table was 30 days; on the offline table it was 730 days.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang closed issue #9567: Last day data in an offline table not being returned by query in a hybrid table when there is no data in real time.

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang closed issue #9567: Last day data in an offline table not being returned by query in a hybrid table when there is no data in real time.
URL: https://github.com/apache/pinot/issues/9567




[GitHub] [pinot] Jackie-Jiang commented on issue #9567: Last day data in an offline table not being returned by query in a hybrid table when there is no data in real time.

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #9567:
URL: https://github.com/apache/pinot/issues/9567#issuecomment-1278238973

   Yes, this is expected behavior for a hybrid table setup. We expect the offline table and the real-time table to have at least 1 day of overlapping data. This issue is a duplicate of #9362. Closing this one; please use #9362 to track it.
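
The overlap requirement above can be illustrated with a small sketch. It assumes the broker splits a hybrid query at a time boundary equal to the latest offline segment end time minus one day, sending rows at or before the boundary to the offline table and later rows to the real-time table. That is a simplified model consistent with the behavior reported in this thread, not Pinot's actual code; the function names `time_boundary` and `route` are hypothetical.

```python
from datetime import date, timedelta


def time_boundary(offline_max_end: date, offset_days: int = 1) -> date:
    """Assumed rule: boundary = latest offline segment end time minus one day."""
    return offline_max_end - timedelta(days=offset_days)


def route(day: date, boundary: date) -> str:
    """Assumed split: days up to the boundary go to OFFLINE, later days to REALTIME."""
    return "OFFLINE" if day <= boundary else "REALTIME"


# Latest offline segment in the report ends on 2022-09-26.
boundary = time_boundary(date(2022, 9, 26))

for day in (date(2022, 9, 25), date(2022, 9, 26)):
    print(day.isoformat(), "->", route(day, boundary))
```

Under these assumptions the boundary is 2022-09-25, so a hybrid query for 2022-09-26 is answered only by the real-time table. With no real-time data, that day comes back empty even though the offline segment exists, which matches what querying the OFFLINE table directly showed.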

