Posted to notifications@accumulo.apache.org by "Michael Wall (JIRA)" <ji...@apache.org> on 2017/06/20 20:44:00 UTC

[jira] [Assigned] (ACCUMULO-4657) BulkImport Performance Bottleneck

     [ https://issues.apache.org/jira/browse/ACCUMULO-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Wall reassigned ACCUMULO-4657:
--------------------------------------

    Assignee: Matt Peterson

> BulkImport Performance Bottleneck
> ---------------------------------
>
>                 Key: ACCUMULO-4657
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4657
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Matt Peterson
>            Assignee: Matt Peterson
>            Priority: Minor
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Logging every "loaded" entry in the table is excessive, especially for tables with multiple simultaneous bulk imports and multiple references to the same file, and it can cause performance problems.  Even when the log level was reduced, there was blocking within log4j.  Checking the log level once outside the loop, and logging the entries only at trace level, improves bulk import performance for such usages.
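
A minimal sketch of the pattern described above, not the actual Accumulo patch. It assumes "that check" means the log-level check, and it uses java.util.logging (Level.FINEST via isLoggable) as a stand-in for log4j's trace level so the example has no external dependencies. The class name BulkLoadLogging, the method processLoadedEntries, and the sample entry strings are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class BulkLoadLogging {
    // Hypothetical logger; the real code would use the project's log4j logger.
    private static final Logger LOG =
            Logger.getLogger(BulkLoadLogging.class.getName());

    /**
     * Walks the "loaded" entries and returns how many were processed.
     * The level check happens once, before the loop, so a disabled
     * trace level costs one boolean test per entry instead of a call
     * into the logging framework per entry.
     */
    public static int processLoadedEntries(List<String> loadedEntries) {
        // Hoisted check: evaluated once, outside the loop.
        final boolean traceEnabled = LOG.isLoggable(Level.FINEST);

        int processed = 0;
        for (String entry : loadedEntries) {
            if (traceEnabled) {
                // Message string is only built when tracing is on.
                LOG.finest("loaded entry: " + entry);
            }
            processed++; // stand-in for the real per-entry work
        }
        return processed;
    }

    public static void main(String[] args) {
        // Hypothetical entries; real ones would come from the metadata table.
        List<String> entries = Arrays.asList("file1.rf", "file2.rf", "file1.rf");
        System.out.println("processed " + processLoadedEntries(entries));
    }
}
```

One trade-off of hoisting the check is that a log-level change made while the loop is running only takes effect on the next call.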



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)