
Posted to dev@jena.apache.org by "Henry Story (Created) (JIRA)" <ji...@apache.org> on 2012/01/29 17:41:10 UTC

[jira] [Created] (JENA-203) support for Non Blocking Parsers

support for Non Blocking Parsers
--------------------------------

                 Key: JENA-203
                 URL: https://issues.apache.org/jira/browse/JENA-203
             Project: Jena
          Issue Type: Improvement
            Reporter: Henry Story


In a Linked Data environment, servers have to fetch data off the web. The speed at which such data
is served can be very slow, so one wants to avoid using up one thread for each connection (1 thread =
0.5 to 1MB approximately). This is why Java NIO was developed, why servers such as Netty
are so popular, why HTTP client libraries such as https://github.com/sonatype/async-http-client are more
and more numerous, and why frameworks such as http://akka.io/, which support relatively lightweight
actors (500 bytes per actor), are growing more visible.

Unless I am mistaken, the only way to parse content is through methods that take an
InputStream, such as this:

    val m = ModelFactory.createDefaultModel()
    m.getReader(lang.jenaLang).read(m, in, base.toString)

That read call blocks. Would it be possible to have an API which allows
one to parse a document in chunks as they arrive from the input?
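
As a rough illustration of the kind of push-style API meant here, the sketch below accepts byte chunks as they arrive and hands each complete line to a callback. It assumes a line-based format such as N-Triples, where one statement sits on one line. The names ChunkedLineFeeder, feed, finish and ChunkedLineFeederDemo are hypothetical and are not part of Jena; the callback is where a per-statement parser would be plugged in.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8
import scala.collection.mutable.ArrayBuffer

// Hypothetical push-style splitter (NOT a Jena API): the caller feeds byte
// chunks as they arrive, and each complete line is passed to `onLine`.
final class ChunkedLineFeeder(onLine: String => Unit) {
  // Bytes of the current, still incomplete line.
  private val pending = new ByteArrayOutputStream()

  // Never blocks: consumes the chunk and returns immediately.
  def feed(chunk: Array[Byte]): Unit = {
    var start = 0
    var i = 0
    while (i < chunk.length) {
      // The byte 0x0A never occurs inside a multi-byte UTF-8 sequence,
      // so splitting on it before decoding is safe.
      if (chunk(i) == '\n'.toByte) {
        pending.write(chunk, start, i - start)
        emit()
        start = i + 1
      }
      i += 1
    }
    // Keep the unfinished tail until the next chunk arrives.
    pending.write(chunk, start, chunk.length - start)
  }

  // Call once at end of input to flush a final line without a newline.
  def finish(): Unit = if (pending.size() > 0) emit()

  private def emit(): Unit = {
    val line = new String(pending.toByteArray, UTF_8).trim
    pending.reset()
    if (line.nonEmpty) onLine(line)
  }
}

object ChunkedLineFeederDemo {
  // Feeds two chunks whose boundary falls in the middle of a statement.
  def run(): List[String] = {
    val seen = ArrayBuffer.empty[String]
    val feeder = new ChunkedLineFeeder(line => { seen += line; () })
    val chunk1 = "<http://example.org/s> <http://example.org/p> \"one\" .\n<http://example.org/s> <http://exam"
    val chunk2 = "ple.org/p> \"two\" .\n<http://example.org/s> <http://example.org/p> \"three\" ."
    feeder.feed(chunk1.getBytes(UTF_8))
    feeder.feed(chunk2.getBytes(UTF_8))
    feeder.finish()
    seen.toList
  }

  def main(args: Array[String]): Unit = run().foreach(println)
}
```

Because feed returns as soon as the chunk is consumed, it could be called directly from an NIO or async-http-client callback without tying up a thread per connection. Formats that are not line-based would need a tokenizer that can suspend mid-token, which this sketch does not attempt.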




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-203) support for Non Blocking Parsers

Posted by "Andy Seaborne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219928#comment-13219928 ] 

Andy Seaborne commented on JENA-203:
------------------------------------

Interesting stuff - I need to find a decent block of time to do more than just look.  

To go back to the title of this JIRA ...

What can be done to "support non-blocking parsers" in addition to the current parsers?  It seems to me that the non-blocking parsers' scatter-gather paradigm is a separate subsystem on top of Jena - is there anything the core could provide to help?

What I'd like to see is that Jena does not need to include every possible feature, but can support independent and vibrant open source projects. (The developers have already talked a bit about some simple modularity while still delivering combined collections in useful forms for common cases, like a single jar with everything in it, or a single jar + dependencies, to make using the command line tools much easier.)

(BTW the n-triples parser link is 404)

                
