Posted to notifications@accumulo.apache.org by "Michael Wall (JIRA)" <ji...@apache.org> on 2017/06/20 20:46:00 UTC

[jira] [Updated] (ACCUMULO-4657) BulkImport Performance Bottleneck

     [ https://issues.apache.org/jira/browse/ACCUMULO-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Wall updated ACCUMULO-4657:
-----------------------------------
    Affects Version/s: 1.7.3
                       1.8.1

> BulkImport Performance Bottleneck
> ---------------------------------
>
>                 Key: ACCUMULO-4657
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4657
>             Project: Accumulo
>          Issue Type: Improvement
>    Affects Versions: 1.7.3, 1.8.1
>            Reporter: Matt Peterson
>            Assignee: Matt Peterson
>            Priority: Minor
>             Fix For: 1.7.4, 1.8.2, 2.0.0
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Logging every "loaded" entry in the table is excessive, especially for tables with multiple simultaneous bulk imports and multiple references to the same file, and it can cause performance problems.  Even when the log level was reduced, there was blocking within log4j.  Performing the log-level check once outside the loop, and logging only at trace level, improves bulk import performance for such usage.
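
The idea described above can be sketched roughly as follows. This is a minimal illustration, not the actual Accumulo patch: the class name, method name, and file names are hypothetical, and java.util.logging stands in for log4j so that the example is self-contained (Level.FINEST plays the role of trace; with log4j 1.2 the corresponding check would be Logger.isTraceEnabled()).

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration only; not taken from the Accumulo source.
public class LoadedEntryLogging {
    private static final Logger log =
        Logger.getLogger(LoadedEntryLogging.class.getName());

    // Walks the "loaded" entries and returns how many were logged.
    public static int processLoadedEntries(List<String> loadedEntries) {
        // Ask the logging framework once, outside the loop, whether
        // trace-level output is enabled, instead of once per entry.
        final boolean traceEnabled = log.isLoggable(Level.FINEST);
        int logged = 0;
        for (String entry : loadedEntries) {
            if (traceEnabled) {
                log.finest("loaded entry: " + entry);
                logged++;
            }
            // Per-entry bulk import work would go here (omitted).
        }
        return logged;
    }

    public static void main(String[] args) {
        // Made-up file names, for demonstration only.
        List<String> entries = Arrays.asList("f1.rf", "f2.rf", "f3.rf");

        log.setLevel(Level.INFO);   // trace disabled: loop never touches the logger
        System.out.println("logged at INFO: " + processLoadedEntries(entries));

        log.setLevel(Level.FINEST); // trace enabled: each entry is logged
        System.out.println("logged at FINEST: " + processLoadedEntries(entries));
    }
}
```

When trace is disabled, the loop body in this sketch makes no calls into the logging framework at all, which is the property that would avoid the per-entry contention mentioned in the description.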



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)