Posted to dev@jena.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/11/12 10:38:00 UTC

[jira] [Commented] (JENA-1414) Performance regression in Model.remove(Model m) method

    [ https://issues.apache.org/jira/browse/JENA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248825#comment-16248825 ] 

ASF GitHub Bot commented on JENA-1414:
--------------------------------------

GitHub user afs opened a pull request:

    https://github.com/apache/jena/pull/306

    Algorithms for JENA-1414

    This PR restructures `GraphUtil.deleteFrom` to separate out the decision on whether to loop on the target (_dst_) graph or the source (_src_) graph.
    
    There are 3 algorithms for discussion:
    
    1. Use the size of the graphs - this is the current Jena 3.5.0 policy
    2. Use the size of the _src_ and iterate on the _dst_ to compare sizes
    3. Iterate on both _src_ and _dst_ to compare sizes. `Graph.size` is not used at all.
    
    The cost of `Graph.size()` can be small (already known) or large (needs to be calculated to be accurate). The latter is bad for large persistent graphs.
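
The three policies above can be sketched roughly as follows. This is an illustration only, not the code in the PR: a "graph" is modelled as a plain `Set<String>`, and the class name, the method names and the two thresholds (`MIN_SRC_SIZE`, `DST_SRC_RATIO`) are all assumptions made for the example. The point it tries to show is that a bounded step through an iterator can answer "is _dst_ much larger than _src_?" without ever computing the full size.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/** Illustrative stand-in only: a "graph" is a Set of triples written as strings. */
final class DeleteFromSketch {
    // Hypothetical tuning constants, chosen for the example (not taken from the PR).
    static final int MIN_SRC_SIZE = 1000;
    static final int DST_SRC_RATIO = 2;

    /** Steps through at most 'limit' elements, so it never counts the whole graph. */
    static boolean isSizeAtLeast(Iterable<String> graph, int limit) {
        int seen = 0;
        Iterator<String> it = graph.iterator();
        while (seen < limit && it.hasNext()) { it.next(); seen++; }
        return seen >= limit;
    }

    /** Policy 1: ask both graphs for their size. */
    static boolean loopOnSrcBySizeSize(Set<String> dst, Set<String> src) {
        int srcSize = src.size();
        return srcSize <= MIN_SRC_SIZE || dst.size() > DST_SRC_RATIO * srcSize;
    }

    /** Policy 2: size of src, bounded iteration on dst. */
    static boolean loopOnSrcBySizeStep(Set<String> dst, Set<String> src) {
        int srcSize = src.size();
        return srcSize <= MIN_SRC_SIZE
            || isSizeAtLeast(dst, DST_SRC_RATIO * srcSize + 1);
    }

    /** Policy 3: iterate on both graphs; size() is never called. */
    static boolean loopOnSrcByStepStep(Set<String> dst, Set<String> src) {
        if (!isSizeAtLeast(src, MIN_SRC_SIZE + 1)) return true;   // small src
        Iterator<String> s = src.iterator();
        Iterator<String> d = dst.iterator();
        while (s.hasNext()) {
            s.next();
            // Consume DST_SRC_RATIO dst elements for every src element.
            for (int i = 0; i < DST_SRC_RATIO; i++) {
                if (!d.hasNext()) return false;   // dst ran out first
                d.next();
            }
        }
        return d.hasNext();   // true only if dst has more than ratio * |src| elements
    }

    /** Removes every element of src from dst, choosing the loop with policy 2. */
    static void deleteFrom(Set<String> dst, Set<String> src) {
        if (dst == src) { dst.clear(); return; }
        if (loopOnSrcBySizeStep(dst, src)) {
            for (String t : src) dst.remove(t);   // src loop: one removal per src element
        } else {
            dst.removeIf(src::contains);          // dst loop: one lookup per dst element
        }
    }
}
```

All three policies return the same answer in this sketch; they differ only in how much work is spent reaching it, which is what matters when a size call has to scan a large persistent graph.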
    
    The PR also adds javadoc to explain how to call a specific algorithm (_src_ loop or _dst_ loop).
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/afs/jena graph-deleteFrom

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/306.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #306
    
----
commit 7faca15490448af1679f2705ad820f0baa7f1efe
Author: Andy Seaborne <an...@apache.org>
Date:   2017-11-11T20:10:51Z

    Algorithms for JENA-1414

----


> Performance regression in Model.remove(Model m) method
> ------------------------------------------------------
>
>                 Key: JENA-1414
>                 URL: https://issues.apache.org/jira/browse/JENA-1414
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: Jena 3.3.0, Jena 3.4.0
>            Reporter: Michał Woźniak
>            Assignee: Andy Seaborne
>         Attachments: graph_util_improve.patch
>
>
> Model.remove(Model) is very slow on large models, as it delegates to GraphUtil.deleteFrom(Graph, Graph), which computes the size of the target graph by iterating over all triples. This computation takes nearly 100% of the time of the Model.remove(Model) operation.
> It seems this commit introduced the issue: https://github.com/apache/jena/commit/781895ce64e062c7f2268a78189a777c39b92844#diff-fbb4d11dc804464f94c27e33e11b18e8
> Due to this bug, deleting a concept scheme from a large ontology may take several minutes.


