
Posted to dev@jena.apache.org by rvesse <gi...@git.apache.org> on 2017/02/01 10:29:01 UTC

[GitHub] jena issue #212: JENA-1284: Improvements for bulk graph operations in GraphU...

Github user rvesse commented on the issue:

    https://github.com/apache/jena/pull/212
  
    I presume this was motivated by a real-world performance concern? It would be interesting to know how much difference it makes.
    
    My only concern is what happens when there is only a slight difference between the sizes of the affected graphs. It looks like in that case you potentially do twice the work, because on one code path you do both a `find()` and a `delete()`/`add()` for every triple.
    
    Would it be worth making the behaviour depend on a configurable percentage difference, e.g. if the graphs are within 10% of each other's size, don't use the new path?
    
    Also, is materialising the list of triples a potential memory issue when the iterator is over a large amount of data?
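
The percentage-threshold suggestion above could look roughly like the following. This is only a sketch of the idea, not code from the pull request: the class `SizeThreshold`, the method `sizesDifferEnough`, and its parameter names are hypothetical, and it works on plain sizes so it does not depend on any Jena API. It assumes the difference is measured relative to the larger of the two graphs.

```java
// Hypothetical helper, not part of Jena or of PR #212.
final class SizeThreshold {
    private SizeThreshold() {}

    /**
     * Returns true when the two sizes differ by more than the given
     * fraction of the larger size, i.e. when the size-dependent code
     * path might be worth taking. With minRelativeDifference = 0.10,
     * graphs within 10% of each other's size return false.
     */
    static boolean sizesDifferEnough(long dstSize, long srcSize, double minRelativeDifference) {
        if (dstSize < 0 || srcSize < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // Written this way so that NaN is rejected as well.
        if (!(minRelativeDifference >= 0.0 && minRelativeDifference <= 1.0)) {
            throw new IllegalArgumentException("threshold must be in [0, 1]");
        }
        long larger = Math.max(dstSize, srcSize);
        if (larger == 0) {
            return false; // both graphs empty, nothing to optimise
        }
        long difference = Math.abs(dstSize - srcSize);
        return difference > minRelativeDifference * larger;
    }
}
```

A caller would compute both graph sizes first and take the new path only when `sizesDifferEnough(...)` returns true. Whether 10% is a sensible default would need measuring.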


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---