Posted to notifications@accumulo.apache.org by "keith-turner (via GitHub)" <gi...@apache.org> on 2023/03/31 19:21:07 UTC

[GitHub] [accumulo] keith-turner opened a new issue, #3270: Accumulo offline scanner is not applying time set via bulk import

keith-turner opened a new issue, #3270:
URL: https://github.com/apache/accumulo/issues/3270

   **Describe the bug**
   
   Bulk imports can optionally set a time that is applied to all keys in the bulk imported files.  This timestamp is applied lazily by a low-level iterator whenever a bulk imported file is read for a scan or compaction.  While working on #3259, it was discovered that the [OfflineScanner](https://github.com/apache/accumulo/blob/8c65e026be2db633d8f2a622aabf05db1c2d6c15/core/src/main/java/org/apache/accumulo/core/clientImpl/OfflineIterator.java#L271-L329) does not appear to lazily apply these timestamps.  The OfflineScanner is only used by the Accumulo MapReduce code when reading an offline table, in which case it reads the files for a tablet directly.
   
   **Expected behavior**
   
   The internal OfflineScanner code should lazily apply the timestamps using the TimeSettingIterator and the timestamp from the file's entry in the metadata table.
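
To illustrate the expected behavior, here is a minimal, self-contained Java sketch of applying a bulk import time lazily on read. It does not use Accumulo's classes: `Entry`, `TimeOverridingIterator`, and `open` are hypothetical stand-ins, and representing "no time set" as `null` is an assumption made for this example, not a description of how the metadata table encodes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LazyTimeSketch {

    /** Hypothetical stand-in for a key/value pair stored in a bulk imported file. */
    static final class Entry {
        final String row;
        final long timestamp;
        final String value;

        Entry(String row, long timestamp, String value) {
            this.row = row;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Wraps a file's iterator and substitutes one timestamp as each entry is read. */
    static final class TimeOverridingIterator implements Iterator<Entry> {
        private final Iterator<Entry> source;
        private final long time;

        TimeOverridingIterator(Iterator<Entry> source, long time) {
            this.source = source;
            this.time = time;
        }

        @Override
        public boolean hasNext() {
            return source.hasNext();
        }

        @Override
        public Entry next() {
            // The file is never rewritten; the time is applied only on read.
            Entry e = source.next();
            return new Entry(e.row, time, e.value);
        }
    }

    /**
     * Opens a "file" for reading. The wrapper is added only when the file's
     * metadata carries a bulk import time (assumption: null means none was set).
     */
    static Iterator<Entry> open(List<Entry> fileContents, Long bulkImportTime) {
        Iterator<Entry> it = fileContents.iterator();
        return bulkImportTime == null ? it : new TimeOverridingIterator(it, bulkImportTime);
    }

    /** Collects the timestamps a reader would observe. */
    static List<Long> timestamps(Iterator<Entry> it) {
        List<Long> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next().timestamp);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> file = Arrays.asList(new Entry("r1", 0L, "a"), new Entry("r2", 0L, "b"));
        System.out.println(timestamps(open(file, null))); // prints [0, 0]
        System.out.println(timestamps(open(file, 42L)));  // prints [42, 42]
    }
}
```

In this sketch, a reader that skips the wrapping step in `open` sees the timestamps stored in the file rather than the one set at import. That is the behavior this issue describes for the offline read path.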
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] keith-turner commented on issue #3270: Accumulo offline scanner is not applying time set via bulk import

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on issue #3270:
URL: https://github.com/apache/accumulo/issues/3270#issuecomment-1494337092

   Looking back, I think this bug exists from 1.10 onward.
   
   The following code, from the commit for the last 1.10 release, does not apply the TimeSettingIterator to the file when needed.
   
   https://github.com/apache/accumulo/blob/db2baf1706c0721e25438d5329ef1bba5159c24d/core/src/main/java/org/apache/accumulo/core/client/impl/OfflineIterator.java#L363




[GitHub] [accumulo] dlmarion commented on issue #3270: Accumulo offline scanner is not applying time set via bulk import

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on issue #3270:
URL: https://github.com/apache/accumulo/issues/3270#issuecomment-1494142937

   @keith-turner  - which version(s) are affected? Is this only for Bulk Import v2, or for both?




[GitHub] [accumulo] keith-turner commented on issue #3270: Accumulo offline scanner is not applying time set via bulk import

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on issue #3270:
URL: https://github.com/apache/accumulo/issues/3270#issuecomment-1494348915

   This bug would impact both bulk import v1 and v2 if either was called with set time enabled ([v1 api, the setTime boolean](https://github.com/apache/accumulo/blob/8c65e026be2db633d8f2a622aabf05db1c2d6c15/core/src/main/java/org/apache/accumulo/core/client/admin/TableOperations.java#L583), [v2 api](https://github.com/apache/accumulo/blob/8c65e026be2db633d8f2a622aabf05db1c2d6c15/core/src/main/java/org/apache/accumulo/core/client/admin/TableOperations.java#L599)) and the table was then read using the Accumulo MapReduce scan offline table feature.

