Posted to dev@commonsrdf.apache.org by "Stian Soiland-Reyes (JIRA)" <ji...@apache.org> on 2015/10/26 17:08:28 UTC

[jira] [Commented] (COMMONSRDF-17) Size method

    [ https://issues.apache.org/jira/browse/COMMONSRDF-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974464#comment-14974464 ] 

Stian Soiland-Reyes commented on COMMONSRDF-17:
-----------------------------------------------

Should we try to reach a decision on this old issue?

I think the considerations in https://github.com/commons-rdf/commons-rdf/issues/27 assume that Commons RDF implementations are only triple stores, which would not want to be compatible with Collection<Triple>. We have since seen that other Java objects may also need to handle a (smaller) set of RDF triples, and may want to use Commons RDF for that.

Thus I think Commons RDF should be a potential basis for anyone who needs to talk about triples on the JVM, triple store or not. How do you deal with a collection of triples? Well, the RDF concepts already include Graph. Such usage might additionally require merging those triple collections, but I don't see why we should prevent Graph from being used for this purpose as well.

Considering also that counting triples can be expensive for those "proper" disk-based triple stores, the method name "size()" is not very good: it looks like Collection.size() (but returns a long instead of an int), which normally operates in memory and takes no more than a few seconds.

How about renaming Graph.size() to Graph.tripleCount()? This also signals that actual counting might be involved, makes clear that we are counting triples (not megabytes), and avoids the conflict with Collection<Triple> for those who feel the need.
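
As a rough illustration of that last point, here is a minimal sketch of how a small in-memory graph could be both a Graph and a Collection<Triple> once the method is called tripleCount(). This is not the actual Commons RDF API: Triple and Graph below are cut-down stand-ins (Graph is reduced to the one method under discussion), and InMemoryGraph and TripleCountDemo are hypothetical names.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the Commons RDF Triple type, reduced to three strings.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical, cut-down Graph after the proposed rename:
// tripleCount() returns a long and no longer clashes with Collection.size().
interface Graph {
    long tripleCount();
}

// A small in-memory graph that is also a Collection<Triple>.
// With "long size()" on Graph this class could not compile, because
// Collection.size() returns int and the two signatures would conflict.
final class InMemoryGraph extends AbstractCollection<Triple> implements Graph {
    private final List<Triple> triples = new ArrayList<>();

    @Override
    public boolean add(Triple triple) {
        return triples.add(triple);
    }

    @Override
    public Iterator<Triple> iterator() {
        return triples.iterator();
    }

    @Override
    public int size() {
        // Collection view: cheap, in memory, int-sized.
        return triples.size();
    }

    @Override
    public long tripleCount() {
        // Graph view: trivial here, but a disk-based store may really have to count.
        return triples.size();
    }
}

class TripleCountDemo {
    public static void main(String[] args) {
        InMemoryGraph graph = new InMemoryGraph();
        graph.add(new Triple("ex:alice", "foaf:knows", "ex:bob"));
        graph.add(new Triple("ex:bob", "foaf:knows", "ex:alice"));
        System.out.println(graph.size() + " " + graph.tripleCount());
    }
}
```

A disk-based implementation would simply not implement Collection<Triple> and would be free to make tripleCount() as slow as it needs to be.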


> Size method
> -----------
>
>                 Key: COMMONSRDF-17
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-17
>             Project: Apache Commons RDF
>          Issue Type: Improvement
>            Reporter: Reto Gmür
>
> The size method is problematic for two reasons:
> - it is incompatible with the Collections-API, implementations cannot at the same time implement Collection<Triple> (even though a Graph is a collection of triples).
> - With some types of implementations calculating the exact size of a graph can be very expensive and often the client just requires an approximate size
> So I propose to replace the size method with the following
> [- size: int: same as in Collection.size (returns Integer.MAX_VALUE for bigger graphs) ]
> - exactSize: long: the exact size
> - approximateSize: long: the approximate size
> For all but exactSize the interface can provide default implementations.
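
For comparison, the three-method proposal quoted above could be sketched roughly as follows. This is only an illustration under my own assumptions: the interface name SizedGraph and the demo class are hypothetical, and the default approximateSize() simply falls back to the exact count, which a real triple store would presumably override with a cheaper estimate.

```java
// Hypothetical interface following the quoted proposal; not part of Commons RDF.
interface SizedGraph {

    // The only method an implementation must provide: the exact number of triples.
    long exactSize();

    // Assumed default: fall back to the exact count when no cheaper estimate exists.
    default long approximateSize() {
        return exactSize();
    }

    // Same contract as Collection.size(): clamp to Integer.MAX_VALUE for bigger graphs.
    default int size() {
        long exact = exactSize();
        return exact > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) exact;
    }
}

class SizedGraphDemo {
    public static void main(String[] args) {
        // SizedGraph has a single abstract method, so a lambda is enough for a demo.
        SizedGraph small = () -> 3L;
        SizedGraph big = () -> 5_000_000_000L;

        System.out.println(small.size());
        System.out.println(big.size() == Integer.MAX_VALUE);
        System.out.println(big.approximateSize());
    }
}
```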


