Posted to dev@jena.apache.org by "Stephen Allen (JIRA)" <ji...@apache.org> on 2011/02/09 18:01:58 UTC

[jira] Created: (JENA-45) Spill to disk SPARQL Update

Spill to disk SPARQL Update
---------------------------

                 Key: JENA-45
                 URL: https://issues.apache.org/jira/browse/JENA-45
             Project: Jena
          Issue Type: New Feature
          Components: ARQ
            Reporter: Stephen Allen


Attached is a patch that implements a spill-to-disk container for performing SPARQL Update commands.  It utilizes a DeferredFileQueue that can serialize/deserialize Bindings, Quads, and Triples.  Also included are some unit tests.

Items yet to be addressed:
1) Read the threshold and temporary file location from a config file
2) Examine compression of the bindings/quads/triples to see if we can improve I/O speed


[1] http://mail-archives.apache.org/mod_mbox/incubator-jena-dev/201102.mbox/%3C003901cbc22e$2e5c5bc0$8b151340$@com%3E
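
As a rough illustration of the spill-to-disk idea described above, here is a minimal Java sketch. It is not the DeferredFileQueue from the patch: the class name SpillingBag, the item-count threshold, the temporary-file naming, and the use of plain strings written with DataOutputStream (standing in for serialized Bindings, Quads, and Triples) are all assumptions made for the example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: NOT the DeferredFileQueue from the patch.
// Items stay in memory until the threshold is reached, then everything
// is moved to a temporary file and later additions go straight to disk.
final class SpillingBag implements AutoCloseable {
    private final long threshold;                 // max items held in memory (assumed unit: item count)
    private final List<String> memory = new ArrayList<>();
    private File spillFile;                       // created lazily on the first spill
    private DataOutputStream out;
    private long spilledCount;

    SpillingBag(long threshold) { this.threshold = threshold; }

    void add(String item) {
        try {
            if (spillFile == null && memory.size() < threshold) {
                memory.add(item);                 // still under the threshold: keep in memory
                return;
            }
            if (spillFile == null) {
                spill();                          // first overflow: move the in-memory items to disk
            }
            out.writeUTF(item);
            spilledCount++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void spill() throws IOException {
        spillFile = File.createTempFile("spill-bag", ".tmp");
        spillFile.deleteOnExit();
        out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(spillFile)));
        for (String s : memory) out.writeUTF(s);
        spilledCount = memory.size();
        memory.clear();
    }

    boolean hasSpilled() { return spillFile != null; }

    // Returns all items in insertion order, reading them back from disk if spilled.
    List<String> readAll() {
        if (spillFile == null) return new ArrayList<>(memory);
        List<String> items = new ArrayList<>();
        try {
            out.flush();
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(spillFile)))) {
                for (long i = 0; i < spilledCount; i++) items.add(in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return items;
    }

    @Override
    public void close() {
        try {
            if (out != null) out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (spillFile != null) spillFile.delete();
        }
    }
}
```

With a threshold of 2, adding two items keeps them in memory; the third add moves all of them to the temporary file, and readAll() still returns them in insertion order. A real implementation would need a proper serializer for RDF terms and a configurable temporary directory, which is the first open item listed above.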



-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-45:
------------------------------

    Attachment: JENA-45-Depends-on-JENA-99-r1157891.patch

Attached a patch using the implementation in JENA-99.


        

[jira] [Updated] (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-45:
------------------------------

    Attachment: JENA-45-Disk-Backed-Updates-r1156193.patch

Updated patch to use the new binding serialization from JENA-85.

JENA-45-Disk-Backed-Updates-r1156193.patch


        

[jira] Updated: (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-45:
------------------------------

    Attachment:     (was: ARQ-Disk-Backed-Updates-r8501.patch)


        

[jira] Updated: (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-45:
------------------------------

    Attachment: ARQ-Disk-Backed-Updates-r8501.patch

Patch against trunk revision 8501.


        

[jira] [Updated] (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-45:
------------------------------

    Attachment: JENA-45-ARQ_r1165687.patch

I made a few updates, similar to those in JENA-44:

1) A threshold of -1 shuts off spilling
2) spillOnDiskUpdateThreshold is now a Long instead of an Integer
3) Removed the default policy from BagFactory; you now have to pass one in explicitly every time

You may have to apply the patch in JENA-44 before this one will apply cleanly.

P.S.  I also had to make a change to NodeFormatterNT (which was added just last week, in revision #1165123) in order for bags of Triples to work.  I think the problem was just a typo, so I've included the fix with this patch.
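
To make the threshold semantics in the list above concrete, here is a small Java sketch. The class name SpillThreshold and its methods are hypothetical and are not taken from the patch; in particular, the choice to spill once the count reaches the threshold (rather than exceeds it) is an assumption.

```java
// Hypothetical policy object; names are illustrative, not the patch's API.
final class SpillThreshold {
    static final long DISABLED = -1L;   // sentinel from the thread: -1 shuts off spilling

    private final long threshold;       // a long, so values above Integer.MAX_VALUE are representable

    SpillThreshold(long threshold) {
        if (threshold < DISABLED) {
            throw new IllegalArgumentException("threshold must be -1 (disabled) or >= 0: " + threshold);
        }
        this.threshold = threshold;
    }

    boolean isEnabled() {
        return threshold != DISABLED;
    }

    // Assumption: spill once the in-memory item count reaches the threshold.
    boolean shouldSpill(long itemsInMemory) {
        return isEnabled() && itemsInMemory >= threshold;
    }
}
```

Using a long means a threshold such as 3,000,000,000 can be expressed directly, and a dedicated -1 sentinel avoids relying on a very large number to mean "off".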




        

[jira] Commented: (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993202#comment-12993202 ] 

Stephen Allen commented on JENA-45:
-----------------------------------

I don't know when a committer will get a chance to look at the patch, so I don't know if I can answer your question with authority.  That being said, I've corresponded with Andy about the design (see link in issue description) and I believe I've implemented the feature according to his input, so hopefully not too many changes will be needed to commit it to the main project.

The one area that still needs changes (and that I was hoping to leave to Andy) is specifying the temporary directory and threshold via a configuration file.


        

[jira] [Resolved] (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen resolved JENA-45.
-------------------------------

    Resolution: Fixed

The changes are checked in so this feature is complete.
                

        

[jira] Commented: (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992611#comment-12992611 ] 

Stephen Allen commented on JENA-45:
-----------------------------------

Looks like there is a fair amount of overlap between this and JENA-44, at least as far as being solutions to serializing bindings to disk.


        

[jira] Updated: (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-45:
------------------------------

    Attachment: ARQ-Disk-Backed-Updates-r8504.patch

Updated patch to implement spill to disk SPARQL Update against trunk revision 8504.


        

Re: [jira] [Commented] (JENA-45) Spill to disk SPARQL Update

Posted by Paolo Castagna <ca...@googlemail.com>.
Paolo Castagna (JIRA) wrote:
> Hi Stephen, first of all thanks for the patch.
> The patch does not apply cleanly with current ARQ trunk. There are problems with UpdateEngineWorker.
> I tried to fix those, but now I have two failures in TestUpdateGraphMem: testModify2 and testCopy.

I have also tried this:
svn co https://jena.svn.sourceforge.net/svnroot/jena/ARQ/trunk/@8504 ARQ
wget https://issues.apache.org/jira/secure/attachment/12470799/ARQ-Disk-Backed-Updates-r8504.patch
cd ARQ
patch -p0 < ../ARQ-Disk-Backed-Updates-r8504.patch
mvn test

Done this way, there are compile errors.

The patch did not apply cleanly, so I am not sure it was generated from revision r8504.

Paolo

[jira] [Commented] (JENA-45) Spill to disk SPARQL Update

Posted by "Paolo Castagna (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012366#comment-13012366 ] 

Paolo Castagna commented on JENA-45:
------------------------------------

Hi Stephen, first of all thanks for the patch.
The patch does not apply cleanly with current ARQ trunk. There are problems with UpdateEngineWorker.
I tried to fix those, but now I have two failures in TestUpdateGraphMem: testModify2 and testCopy.



[jira] [Assigned] (JENA-45) Spill to disk SPARQL Update

Posted by "Paolo Castagna (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paolo Castagna reassigned JENA-45:
----------------------------------

    Assignee: Paolo Castagna


        

[jira] [Commented] (JENA-45) Spill to disk SPARQL Update

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034329#comment-13034329 ] 

Andy Seaborne commented on JENA-45:
-----------------------------------

Comments added to JENA-44 - wondering what commonality (some basic operations? library?) we can find.

As some aspects may be performance critical (I/O, serializing/deserializing), it's worth finding commonality so it can be optimized and benefit both JENA-44 and JENA-45.



[jira] [Updated] (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen updated JENA-45:
------------------------------

    Attachment: ARQ-Disk-Backed-Updates-r8657.patch

Updated patch against trunk revision 8657.


[jira] [Closed] (JENA-45) Spill to disk SPARQL Update

Posted by "Stephen Allen (Closed) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Allen closed JENA-45.
-----------------------------

    

        

[jira] [Commented] (JENA-45) Spill to disk SPARQL Update

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101595#comment-13101595 ] 

Andy Seaborne commented on JENA-45:
-----------------------------------

I tried to apply "JENA-45-ARQ_r1165687.patch" against the codebase, after JENA-44 had been applied.

All worked except for class BagFactory, where the patch does not match and I had to apply the change by hand.  Stephen, could you check that I've got that right?  (It compiles and passes the tests.)  The code is checked in with the patch applied.


        

[jira] [Commented] (JENA-45) Spill to disk SPARQL Update

Posted by "Andy Seaborne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101598#comment-13101598 ] 

Andy Seaborne commented on JENA-45:
-----------------------------------

Shouldn't Symbol spillOnDiskUpdateThreshold be in ARQ?

What should go in ChangeLog.txt?



        

[jira] Commented: (JENA-45) Spill to disk SPARQL Update

Posted by "Sam Tunnicliffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992922#comment-12992922 ] 

Sam Tunnicliffe commented on JENA-45:
-------------------------------------

I think it should be pretty straightforward to replace the serialisation mechanism in JENA-44 with the (more sophisticated) one implemented here. Is that something worth me doing now, or should I hold off while this patch is still uncommitted?


        

[jira] [Assigned] (JENA-45) Spill to disk SPARQL Update

Posted by "Paolo Castagna (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paolo Castagna reassigned JENA-45:
----------------------------------

    Assignee: Andy Seaborne  (was: Paolo Castagna)


        

[jira] [Updated] (JENA-45) Spill to disk SPARQL Update

Posted by "Paolo Castagna (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paolo Castagna updated JENA-45:
-------------------------------

    Attachment: JENA-45_ARQ_r1165123.patch

UpdateEngineWorker now uses a Symbol (i.e. spillOnDiskUpdateThreshold) to obtain the threshold that decides when to spill to disk. The value defaults to Integer.MAX_VALUE (i.e. the feature is off by default).

Anything left on this one?
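
The lookup-with-default behaviour described in this comment can be sketched in plain Java as follows. This is only an illustration: a java.util.Map stands in for ARQ's context object, the string key stands in for the Symbol, and the class name ThresholdLookup is hypothetical. None of this is the actual UpdateEngineWorker code.

```java
import java.util.Map;

// Hypothetical stand-in for reading a setting from a query/update context.
final class ThresholdLookup {
    // Key name taken from the thread; using a String key instead of a Symbol is an assumption.
    static final String SPILL_KEY = "spillOnDiskUpdateThreshold";

    // Default from the comment above: Integer.MAX_VALUE, i.e. effectively never spill.
    static final int DEFAULT_THRESHOLD = Integer.MAX_VALUE;

    static int threshold(Map<String, Object> context) {
        Object value = context.get(SPILL_KEY);
        if (value == null) {
            return DEFAULT_THRESHOLD;             // not configured: feature is off
        }
        if (value instanceof Number) {
            return ((Number) value).intValue();   // set programmatically as a number
        }
        return Integer.parseInt(value.toString().trim());  // set as text, e.g. from a config file
    }
}
```

An empty context yields Integer.MAX_VALUE, so callers that never configure the setting keep the old all-in-memory behaviour.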
