Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/12/19 03:21:51 UTC

[GitHub] [accumulo] arvindshmicrosoft opened a new issue #1464: Leverage SimpleThreadPool for improving importtable performance

arvindshmicrosoft opened a new issue #1464: Leverage SimpleThreadPool for improving importtable performance
URL: https://github.com/apache/accumulo/issues/1464
 
 
   Importtable reads the mappings.txt file (which is created [here](https://github.com/apache/accumulo/blob/1ed12fa585a37caf36cfd674bd7ad6903c1d4e79/server/master/src/main/java/org/apache/accumulo/master/tableOps/tableImport/MapImportFileNames.java#L69-L93)) and [renames](https://github.com/apache/accumulo/blob/1ed12fa585a37caf36cfd674bd7ad6903c1d4e79/server/master/src/main/java/org/apache/accumulo/master/tableOps/tableImport/MoveExportedFiles.java#L64-L69) files. The rename calls are all made on a single calling thread, so the operation scales poorly when there are large numbers (1000s) of files to process.
   
   I believe it would help to implement a thread pool similar to what is in place for [Bulk Import](https://github.com/apache/accumulo/blob/1ed12fa585a37caf36cfd674bd7ad6903c1d4e79/server/master/src/main/java/org/apache/accumulo/master/tableOps/bulkVer2/BulkImportMove.java#L111-L113) so that the operation scales.
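   
   To make the idea concrete, here is a minimal sketch of what fanning the renames out to a pool could look like. It is only an illustration under stated assumptions: it uses the JDK's `ExecutorService` and `java.nio.file.Files.move` as stand-ins for Accumulo's SimpleThreadPool and the Hadoop rename call, and the class name `ParallelRenameSketch`, the method `renameAll`, and the `numThreads` parameter are hypothetical, not existing Accumulo code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: not the Accumulo implementation.
class ParallelRenameSketch {

  // Submits one rename task per (source, destination) pair to a fixed-size pool,
  // waits for all of them, and returns the number of files moved.
  // Assumption: local-filesystem moves stand in for the real rename calls.
  static int renameAll(Map<Path, Path> renames, int numThreads)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Path>> results = new ArrayList<>();
      for (Map.Entry<Path, Path> entry : renames.entrySet()) {
        Path src = entry.getKey();
        Path dst = entry.getValue();
        Callable<Path> task = () -> Files.move(src, dst);
        results.add(pool.submit(task));
      }

      int moved = 0;
      for (Future<Path> result : results) {
        try {
          result.get(); // blocks until this rename finishes
          moved++;
        } catch (ExecutionException e) {
          // Surface the first failure; the finally block stops remaining work.
          throw new IOException("rename failed", e.getCause());
        }
      }
      return moved;
    } finally {
      pool.shutdownNow();
    }
  }
}
```

   A caller would build the source-to-destination map from the parsed mappings.txt entries and pass a configurable thread count. Collecting the futures before returning keeps the step's existing behaviour of not completing until every file has been renamed.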
   
   Before going down that path, I would appreciate any input or feedback on this proposal, so that I can target my efforts accordingly.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [accumulo] arvindshmicrosoft commented on issue #1464: Leverage SimpleThreadPool for improving importtable performance

Posted by GitBox <gi...@apache.org>.
arvindshmicrosoft commented on issue #1464: Leverage SimpleThreadPool for improving importtable performance
URL: https://github.com/apache/accumulo/issues/1464#issuecomment-585007111
 
 
   FYI, I'm starting to work on this, so I would appreciate it if this issue could be assigned to me.


[GitHub] [accumulo] milleruntime closed issue #1464: Leverage SimpleThreadPool for improving importtable performance

Posted by GitBox <gi...@apache.org>.
milleruntime closed issue #1464: Leverage SimpleThreadPool for improving importtable performance
URL: https://github.com/apache/accumulo/issues/1464
 
 
   
