Posted to dev@jena.apache.org by "Paolo Castagna (JIRA)" <ji...@apache.org> on 2011/03/29 12:40:05 UTC

[jira] [Commented] (JENA-44) Support external sorting of bindings in ARQ

    [ https://issues.apache.org/jira/browse/JENA-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012420#comment-13012420 ] 

Paolo Castagna commented on JENA-44:
------------------------------------

QueryIterSort extends QueryIterPlainWrapper which extends QueryIter which extends QueryIteratorBase which implements QueryIterator. :-)
QueryIteratorBase has the machinery to deal with query cancellation (i.e. every time hasNext() is called, it checks whether the query has been cancelled).
Moreover, QueryIterSort overrides requestCancel() in the same way as QueryIterGroup does.
Therefore, this patch should play nicely with the query cancellation (i.e. JENA-29).
JENA-44 does not use the serialization for bindings proposed by JENA-45. However, we can commit this now and, once JENA-45 is committed, change the BindingSerializer to use it, so that we have just one way to serialize bindings.
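
To illustrate the cancellation behaviour described above, here is a minimal sketch of an iterator wrapper that checks a cancel flag on every hasNext() call. It is not the ARQ code: the class name CancellableIterator and the use of IllegalStateException as the "cancelled" signal are assumptions made for this example only.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the role QueryIteratorBase plays; not ARQ code.
class CancellableIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    // Set from another thread, read by the thread consuming the iterator.
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);

    CancellableIterator(Iterator<T> source) {
        this.source = source;
    }

    // Ask the iterator to stop; takes effect on the next hasNext() call.
    public void requestCancel() {
        cancelRequested.set(true);
    }

    @Override
    public boolean hasNext() {
        // The cancellation check happens on every call, before touching the source.
        if (cancelRequested.get()) {
            throw new IllegalStateException("Query cancelled"); // placeholder exception
        }
        return source.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return source.next();
    }
}
```

Because next() goes through hasNext(), a consumer that only ever calls next() still observes the cancellation. A sorting iterator built on such a base would additionally need to forward requestCancel() to its input, which is the part QueryIterSort handles by overriding requestCancel().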

If this is the case, are there any technical reasons holding this back from going into trunk?
If not, I am happy to go ahead and commit this.

We are already using a patched ARQ version with JENA-44 and we have not seen any problems with it.
Committing this would greatly help us, since we would no longer need to manage separate SNAPSHOTs with this patch applied.


> Support external sorting of bindings in ARQ
> -------------------------------------------
>
>                 Key: JENA-44
>                 URL: https://issues.apache.org/jira/browse/JENA-44
>             Project: Jena
>          Issue Type: New Feature
>          Components: ARQ
>            Reporter: Sam Tunnicliffe
>            Assignee: Paolo Castagna
>            Priority: Minor
>         Attachments: JENA-44-0.patch, JENA-44_ARQ_r8531.patch
>
>
> In QueryIterSort, the sorting of the contents of an Iterator<Binding> is done in memory, using Arrays.sort. This can be problematic where the set to be sorted is large. A possible solution could be to use an external, disk-backed algorithm. A hybrid approach may be better, whereby we attempt the in-memory sort, but when the number of bindings encountered goes over a certain number, resort to the disk-backed variant.
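
The hybrid approach described in the issue can be sketched roughly as follows. This is not the JENA-44 patch: HybridSorter, its threshold parameter, and the use of newline-free strings as stand-ins for serialized bindings are all assumptions made for illustration. Items are kept in memory until the threshold is reached; from then on, sorted runs are spilled to temporary files and merged at the end.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: items are strings without newlines, one per line on disk.
class HybridSorter {
    private final int threshold;                       // max items held in memory
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> runs = new ArrayList<>(); // sorted runs spilled so far

    HybridSorter(int threshold) {
        this.threshold = threshold;
    }

    void add(String item) {
        buffer.add(item);
        if (buffer.size() >= threshold) {
            spill();
        }
    }

    // Sort the in-memory buffer and write it to a temp file as one sorted run.
    private void spill() {
        Collections.sort(buffer);
        try {
            Path run = Files.createTempFile("sort-run", ".txt");
            run.toFile().deleteOnExit();
            try (BufferedWriter w = Files.newBufferedWriter(run)) {
                for (String s : buffer) {
                    w.write(s);
                    w.newLine();
                }
            }
            runs.add(run);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buffer.clear();
    }

    // Returns everything in sorted order. A real implementation would stream
    // the merged output instead of collecting it into a list.
    List<String> sorted() {
        if (runs.isEmpty()) {              // never hit the threshold: pure in-memory sort
            Collections.sort(buffer);
            return new ArrayList<>(buffer);
        }
        if (!buffer.isEmpty()) {
            spill();                       // flush the final partial run
        }
        return mergeRuns();
    }

    // One entry per run: the current smallest unread line and its reader.
    private static final class Head {
        final String value;
        final BufferedReader reader;
        Head(String value, BufferedReader reader) {
            this.value = value;
            this.reader = reader;
        }
    }

    // k-way merge of the sorted runs using a min-heap keyed on each run's head.
    private List<String> mergeRuns() {
        PriorityQueue<Head> heap = new PriorityQueue<>((a, b) -> a.value.compareTo(b.value));
        List<BufferedReader> readers = new ArrayList<>();
        List<String> out = new ArrayList<>();
        try {
            try {
                for (Path run : runs) {
                    BufferedReader r = Files.newBufferedReader(run);
                    readers.add(r);
                    String first = r.readLine();
                    if (first != null) heap.add(new Head(first, r));
                }
                while (!heap.isEmpty()) {
                    Head h = heap.poll();
                    out.add(h.value);
                    String next = h.reader.readLine();
                    if (next != null) heap.add(new Head(next, h.reader));
                }
            } finally {
                for (BufferedReader r : readers) r.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

With a threshold of 3, adding seven items produces two spilled runs plus a final partial run, and sorted() merges the three files. With fewer than three items nothing touches the disk, which is the "attempt the in-memory sort first" half of the proposal.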

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira