Posted to dev@jena.apache.org by "Andy Seaborne (Jira)" <ji...@apache.org> on 2020/09/18 14:39:00 UTC

[jira] [Comment Edited] (JENA-1965) Writing streams of RDF

    [ https://issues.apache.org/jira/browse/JENA-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198367#comment-17198367 ] 

Andy Seaborne edited comment on JENA-1965 at 9/18/20, 2:38 PM:
---------------------------------------------------------------

{{ResultSet}} is an iterator for pulling the results but how it is implemented is not API. In the query engine, the next row is generated on demand.

For the parallel sender use case, it will need more than a delivery API to handle the synchronization.

Maybe a {{ResultSet}} implementation with a bounded {{Queue}}, or maybe passing in a {{Consumer<QuerySolution>}} as a callback. Or both, to manage the concurrency.

c.f. {{RDFConnection.querySelect(... , Consumer<QuerySolution>)}}. 

(There is a Graph/Triple/Node equivalent in a separate project, but all its networking uses the Java 11 HTTP APIs.)
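
A minimal sketch of the bounded-queue idea, assuming a single producer thread and a single consumer thread. It does not use any Jena types: the class name {{QueueBackedRows}} and its {{finish()}} method are hypothetical, and the type parameter {{T}} stands in for {{QuerySolution}}. The producer pushes rows through the {{Consumer}} side, and the reader pulls them through the {{Iterator}} side; the bounded queue blocks the producer when the reader falls behind.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical bridge: rows pushed via accept() are pulled through an Iterator. */
final class QueueBackedRows<T> implements Iterator<T>, Consumer<T> {
    // Sentinel marking the end of the results; never handed to the caller.
    private static final Object END = new Object();

    private final BlockingQueue<Object> queue;
    private Object lookahead; // null when nothing has been taken from the queue yet

    QueueBackedRows(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Producer side: blocks while the queue is full (back-pressure). Rows must be non-null. */
    @Override
    public void accept(T row) {
        put(row);
    }

    /** Producer side: signal that no more rows will arrive. */
    public void finish() {
        put(END);
    }

    private void put(Object item) {
        try {
            queue.put(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while sending a row", e);
        }
    }

    /** Consumer side: blocks until a row or the end marker is available. */
    @Override
    public boolean hasNext() {
        if (lookahead == null) {
            try {
                lookahead = queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for a row", e);
            }
        }
        return lookahead != END;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T row = (T) lookahead;
        lookahead = null;
        return row;
    }
}
```

Because the object is both a {{Consumer}} and an {{Iterator}}, it could in principle be handed to a callback-style API on one thread and read on another. Error propagation from producer to consumer is left out of this sketch.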


was (Author: andy.seaborne):
{{ResultSet}} is an iterator for pulling the results but how it is implemented is not API. In the query engine, the next row is generated on demand.

For the parallel sender use case, it will need more than a delivery API to handle the synchronization.

Maybe a {{ResultSet}} implementation with a bounded {{Deque}}, or maybe passing in a {{Consumer<QuerySolution>}} as a callback. Or both, to manage the concurrency.

c.f. {{RDFConnection.querySelect(... , Consumer<QuerySolution>)}}. 

(There is a Graph/Triple/Node equivalent in a separate project, but all its networking uses the Java 11 HTTP APIs.)

> Writing streams of RDF
> ----------------------
>
>                 Key: JENA-1965
>                 URL: https://issues.apache.org/jira/browse/JENA-1965
>             Project: Apache Jena
>          Issue Type: New Feature
>          Components: RIOT
>    Affects Versions: Jena 3.16.0
>            Reporter: Claus Stadler
>            Priority: Major
>
> For streams of Models and Datasets (or Graphs and DatasetGraphs) there does not appear to be a 'push'-based RDF writer.
> Although there exists the deprecated method:
> {code:java}
> WriterDatasetRIOT ds = RDFDataMgr.createDatasetWriter(RDFFormat.TURTLE_PRETTY);
> {code}
> The documentation states that the returned object is for one-time use only.
> The feature request is to make it possible to write out streams of Datasets in a push-based manner. The writer should maintain state so that prefixes and base IRIs are not written out redundantly.
> {code:java}
> try (OutputStream out = ...) {
>   StreamWriterDatasetRIOT sink = RDFDataMgr.createStreamDatasetWriter(out, RDFFormat.TURTLE_PRETTY);
>   sink.start(); // May immediately trigger a write on the output stream
>   for (Dataset ds : streamOfDatasets) {
>     sink.send(ds);
>     sink.flush();
>   }
>   sink.finish(); // Write out footer and free resources 
>   // Is there a need for sink.close()?
> } // close resources
> {code}
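
The requirement that prefixes are not written out redundantly can be illustrated with a small sketch. It does not use Jena and is not the proposed {{StreamWriterDatasetRIOT}}: the class {{PrefixTrackingWriter}} and its methods are hypothetical, and the body of each batch is assumed to be already serialized as Turtle text. The only point shown is the state kept between calls to {{send}}.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stateful writer: emits a prefix declaration only when it is new or changed. */
final class PrefixTrackingWriter {
    private final Writer out;
    // Prefix label -> namespace IRI most recently written to the stream.
    private final Map<String, String> declared = new HashMap<>();

    PrefixTrackingWriter(Writer out) {
        this.out = out;
    }

    /** Write one batch: any new or changed prefixes first, then the pre-serialized body. */
    void send(Map<String, String> prefixes, String body) {
        try {
            for (Map.Entry<String, String> entry : prefixes.entrySet()) {
                String previous = declared.put(entry.getKey(), entry.getValue());
                if (!entry.getValue().equals(previous)) {
                    out.write("@prefix " + entry.getKey() + ": <" + entry.getValue() + "> .\n");
                }
            }
            out.write(body);
            out.write("\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Push buffered output to the underlying writer. */
    void flush() {
        try {
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real implementation would also need to track the base IRI and to handle formats whose pretty-printing needs the whole graph before output, which this sketch ignores.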



--
This message was sent by Atlassian Jira
(v8.3.4#803005)