Posted to users@jena.apache.org by Martynas Jusevičius <ma...@atomgraph.com> on 2020/03/26 13:36:05 UTC

Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Hi,

I'm working on a long overdue upgrade of Jena.

So far the area where I can see most changes will be needed is the
implementation of ReaderRIOT streaming parser for RDF/POST:
https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/riot/lang/RDFPostReader.java

Is LangEngine the recommended base class for such parsers these days?
https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/riot/lang/LangEngine.java

Currently it's extending ReaderRIOTBase.

Also can't figure out what to replace
ParserProfile.getPrologue().setBaseURI() calls with. I can see the
latest LangTurtleBase uses ParserProfile.setBaseIRI(), but I can't
find such method in 3.13.1.

Thanks.

Martynas

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.

On 27/03/2020 11:53, Martynas Jusevičius wrote:
> Rob,
> 
> my immediate problem is different HttpClient versions from Jersey
> (4.1.1) and Jena (4.2.6).

Jena 3.13.x depends on httpclient 4.5.10 / httpcore 4.4.9

org.eclipse.ee4j Jersey is 2.30.1 using httpclient 4.5.9

     Andy

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
Rob,

my immediate problem is different HttpClient versions from Jersey
(4.1.1) and Jena (4.2.6).


Re. JAX-RS interfaces, it was a suggestion how to split parsers from
actual I/O, which is currently mixed up in Jena IMO. Let me give you
an example:

@Provider // reads/writes all supported RDF media types by invoking Jena's parsers
public class ModelProvider implements MessageBodyReader<Model>, MessageBodyWriter<Model>
{
    public Model readFrom(Class<Model> type, Type genericType, Annotation[] annotations,
            MediaType mediaType, MultivaluedMap<String, String> httpHeaders,
            InputStream entityStream) throws IOException
    public Model read(Model model, InputStream is, Lang lang, String baseURI)
    public Model read(Model model, InputStream is, Lang lang, String baseURI,
            ErrorHandler errorHandler, ParserProfile parserProfile) ...

    public void writeTo(Model model, Class<?> type, Type genericType, Annotation[] annotations,
            MediaType mediaType, MultivaluedMap<String, Object> httpHeaders,
            OutputStream entityStream) throws IOException
    public Model write(Model model, OutputStream os, Lang lang, String baseURI)
}

Then using Jersey Client, you can simply do

    ClientConfig config = new DefaultClientConfig();
    config.getSingletons().add(new ModelProvider());
    Client client = Client.create(config);
    client.addFilter(new HTTPBasicAuthFilter(usr, pwd));
    Model m = client.resource(uri).get(Model.class);

Both the server *and* the client support the MessageBodyReader/Writer
providers. They are chosen automatically behind the scenes and are
orthogonal to the actual HTTP interaction. The client code doesn't
even get to see the RDFParser; it gets a Model straight away.

I think it's safe to say that Jersey Client provides all the HTTP
features you mention, while also providing a higher level of
abstraction than plain HttpClient, and more modular and orthogonal
components than Jena's current HTTP I/O.
I've added HTTPBasicAuthFilter for this purpose, but obviously
different kinds of filters can be plugged in.

Note that this is using Jersey 1.9, the latest 2.x is probably even
more advanced: https://eclipse-ee4j.github.io/jersey.github.io/documentation/latest/client.html

On Fri, Mar 27, 2020 at 12:21 PM Rob Vesse <rv...@dotnetrdf.org> wrote:
>
> Martynas
>
> I don't see what bearing JAX-RS has on the HttpClient part of this discussion
>
> Jena needs something to manage the HTTP connections regardless of how it reads and writes data over those connections.  A lot of users have use cases that require authenticating themselves to their remote services, and providing direct access to the HTTP client gives users the ability to implement whatever authentication mechanism they need for their environment.  It also lets users choose how they do connection pooling, max connection limits, custom SSL setups etc.
>
> Rob
>
> On 27/03/2020, 10:46, "Martynas Jusevičius" <ma...@atomgraph.com> wrote:
>
>     Andy,
>
>     if I don't exclude Jena's HttpClient, it's going to clash with the
>     HttpClient that Jersey includes, as the versions are almost certainly
>     different. That's why I did it in the first place.
>
>     What do I do then?
>
>     I had proposed multiple times that Jena adopts JAX-RS
>     MessageBodyReader/Writer approach:
>     https://docs.oracle.com/javaee/7/api/javax/ws/rs/ext/MessageBodyReader.html
>     https://docs.oracle.com/javaee/7/api/javax/ws/rs/ext/MessageBodyWriter.html
>
>     They cleanly separate the I/O for different media types.
>
>     On Fri, Mar 27, 2020 at 11:37 AM Andy Seaborne <an...@apache.org> wrote:
>     >
>     > 3.0.1 was a long time ago!
>     >
>     > All the parsing code goes through RDFParser whichever API route you try.
>     >
>     > RDFParser.fromString(validRDFPost)
>     >     .base("http://base")
>     >     .lang(RDFPOST)
>     >     .parse(parsed);
>     >
>     > The code needs to read URLs and provide control of the HTTP connection
>     > to be used so the user can set up whatever HTTP interaction they need.
>     >
>     > That's HttpClient - the built-in networking doesn't make it easy to
>     > provide per-connection controls.
>     >
>     > HttpClient is null.
>     > Don't exclude HttpClient.
>     > There isn't a common abstraction across HTTP libraries.
>     >
>     >
>     > I have had a go at switching all network interaction to using
>     > java.net.http (Java11-onwards). As Jena currently uses HttpClient v4,
>     > and the v5 interfaces are quite different, change was eventually coming
>     > anyway.
>     >
>     > But obv - Java11.
>     >
>     > https://github.com/afs/jena-http
>     >
>     >      Andy
>     >
>     >
>     > On 26/03/2020 23:11, Martynas Jusevičius wrote:
>     >  > Thanks.
>     >  >
>     >  > I'm trying 3.14.0 for now. Refactored RDFPostReader without much
>     >  > trouble, but now I get
>     >  >
>     >  > java.lang.NoClassDefFoundError: org/apache/http/client/HttpClient
>     >  > at org.apache.jena.riot.RDFParser.create(RDFParser.java:114)
>     >  >
>     >  > because RDFParser now has a field of type HttpClient :/ I guess it
>     >  > didn't in 3.0.1.
>     >  >
>     >  > I had excluded HTTP Client because Core is using Jersey Client:
>     >  >
>     >  >          <dependency>
>     >  >              <groupId>org.apache.jena</groupId>
>     >  >              <artifactId>jena-arq</artifactId>
>     >  >              <version>3.14.0</version>
>     >  >              <exclusions>
>     >  >                  <exclusion>
>     >  >                      <groupId>org.apache.httpcomponents</groupId>
>     >  >                      <artifactId>httpclient</artifactId>
>     >  >                  </exclusion>
>     >  >                  <exclusion>
>     >  >                      <groupId>org.apache.httpcomponents</groupId>
>     >  >                      <artifactId>httpclient-cache</artifactId>
>     >  >                  </exclusion>
>     >  >              </exclusions>
>     >  >          </dependency>
>     >  >
>     >  > This makes me wonder why the parsers/writers are dealing directly with
>     >  > HTTP. I already had this problem with the JsonLDWriter:
>     >  >
>     > https://mail-archives.apache.org/mod_mbox/jena-users/201712.mbox/%3cCAE35VmyGk-biQ1Fayp3zoOyiMikZhvjA8dTuGEd3JaYR98uYOA@mail.gmail.com%3e
>     >  >
>     >  > Not sure how to proceed.
>     >  >
>
>
>
>
>

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.

On 27/03/2020 15:30, Martynas Jusevičius wrote:
> This is getting off-topic, but I want to follow up on a few things.
> 
>> If the input is 1B triples, it can't go into a default model - straight
>> to database is preferable even if it can be buffered.
> 
> What prevents having a MessageBodyReader<StreamRDF>?

Wrong way round. StreamRDF is the destination. You want to pass it a 
pull interface, not a push interface and will need a push-pull conversion.

It could return a pull type interface like Iterator<> or Java Stream<>.

Now you need a pull parser because Iterators pull, and the pull step 
moves one or several triples to the output, then pauses.

Passing in a StreamRDF, as ReaderRIOT does, means the parser is 
controlling the tempo.  Invert that and there will have to be a pair of 
threads or rewrite all the Jena parsers.
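
A rough illustration of the push-pull conversion described above, as a minimal Java sketch using only the standard library. It does not use Jena: the Consumer<T> handed to the producer is a hypothetical stand-in for a StreamRDF destination, and PushToPull / asIterator are names made up for this example. The producer (the "parser") runs on its own thread and pushes into a bounded queue; the Iterator pulls from that queue, which is the "pair of threads" arrangement.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

class PushToPull {
    // Marker placed on the queue once the producer has finished.
    private static final Object END = new Object();

    /**
     * Runs a push-style producer on its own thread and exposes its output
     * as a pull-style Iterator. Items must be non-null.
     */
    static <T> Iterator<T> asIterator(Consumer<Consumer<T>> pushProducer, int capacity) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                // The producer controls the tempo here; put() blocks when the queue is full.
                pushProducer.accept(item -> put(queue, item));
            } finally {
                put(queue, END);
            }
        });
        producer.setDaemon(true); // an abandoned iterator must not keep the JVM alive
        producer.start();

        return new Iterator<T>() {
            private Object buffered; // null means "nothing fetched yet"

            @Override
            public boolean hasNext() {
                if (buffered == null) buffered = take(queue); // blocks until the producer pushes
                return buffered != END;
            }

            @SuppressWarnings("unchecked")
            @Override
            public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                T item = (T) buffered;
                buffered = null;
                return item;
            }
        };
    }

    private static void put(BlockingQueue<Object> queue, Object item) {
        try {
            queue.put(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static Object take(BlockingQueue<Object> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical "parser": pushes three strings into whatever sink it is given.
        Consumer<Consumer<String>> parser = sink -> {
            sink.accept("triple-1");
            sink.accept("triple-2");
            sink.accept("triple-3");
        };
        Iterator<String> it = asIterator(parser, 2);
        while (it.hasNext()) System.out.println(it.next());
    }
}
```

The bounded queue is what lets the consumer slow the producer down. This sketch does not pass producer errors on to the consumer and does not stop the producer if the caller abandons the iterator; a real adapter would need both.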

And you will need to consume the InputStream, i.e. the caller must 
finish using the Iterator before starting the HTTP response (see 
Mikael's message of 4th March).

>> What would be useful is to contribute a JAX-RS module for those people
>> that want to use Apache Jena in a JAX-RS/J2EE situation and can work
>> with the JAX-RS setup of returning a fresh Model (well, Graph is better).
>>
> 
> That is what AtomGraph Core does. Just needs upgrading...
> 
>>
>> The HttpClient issue is dependency management of version, not a specific
>> issue.
> 
> I understand. But is there a solution, if the HttpClient versions are
> not compatible?

Upgrade, or change the Jena code to not touch HttpClient (make it an Object), or 
use the version of your choice (but 4.1.1 is rather different).

     Andy

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
This is getting off-topic, but I want to follow up on a few things.

> If the input is 1B triples, it can't go into a default model - straight
> to database is preferable even if it can be buffered.

What prevents having a MessageBodyReader<StreamRDF>?

> What would be useful is to contribute a JAX-RS module for those people
> that want to use Apache Jena in a JAX-RS/J2EE situation and can work
> with the JAX-RS setup of returning a fresh Model (well, Graph is better).
>

That is what AtomGraph Core does. Just needs upgrading...

>
> The HttpClient issue is dependency management of version, not a specific
> issue.

I understand. But is there a solution, if the HttpClient versions are
not compatible?

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.
The basic interface to the parsers is ReaderRIOT.

A parser = grammar + tokenizer + object maker policy.

It takes an input stream, base URI and outputs triples etc to StreamRDF.

It is created with a ParserProfile that handles the policy of 
bytes->tokens/strings->RDF terms, triples and quads.

No HttpClient in ReaderRIOT.


There is a registry mapping each Lang to a per-language parser factory.
Other code decides the language - not this level of the system.

HTTP headers are one way of achieving that - they are not the only way - 
files, for example.  Or forcing the choice because, real world, HTTP 
headers are sometimes wrong.
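
To make the separation above concrete, here is a minimal standard-library Java sketch of "choose the language first, then look up a factory". None of it is Jena code: LangChooser, Parser, register, chooseLang and createParser are hypothetical names for illustration, and the precedence order (forced choice, then Content-Type, then file extension) is an assumption about one reasonable policy, not a description of what Jena does.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

class LangChooser {
    /** Hypothetical stand-in for a parser; the real interface in Jena is ReaderRIOT. */
    interface Parser { String name(); }

    // Registry: language name -> factory that creates a fresh parser.
    private final Map<String, Supplier<Parser>> factories = new HashMap<>();
    private final Map<String, String> byContentType = new HashMap<>();
    private final Map<String, String> byExtension = new HashMap<>();

    void register(String lang, String contentType, String extension, Supplier<Parser> factory) {
        factories.put(lang, factory);
        byContentType.put(contentType.toLowerCase(Locale.ROOT), lang);
        byExtension.put(extension.toLowerCase(Locale.ROOT), lang);
    }

    /** Assumed policy: explicit override, then Content-Type header, then file extension. */
    Optional<String> chooseLang(String forcedLang, String contentTypeHeader, String path) {
        if (forcedLang != null) return Optional.of(forcedLang);
        if (contentTypeHeader != null) {
            // Drop parameters such as "; charset=utf-8".
            String mediaType = contentTypeHeader.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
            String lang = byContentType.get(mediaType);
            if (lang != null) return Optional.of(lang);
        }
        if (path != null) {
            int dot = path.lastIndexOf('.');
            if (dot >= 0) {
                String lang = byExtension.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
                if (lang != null) return Optional.of(lang);
            }
        }
        return Optional.empty();
    }

    /** The parsing level only sees the already-chosen language. */
    Parser createParser(String lang) {
        Supplier<Parser> factory = factories.get(lang);
        if (factory == null) throw new IllegalArgumentException("No parser registered for " + lang);
        return factory.get();
    }

    public static void main(String[] args) {
        LangChooser chooser = new LangChooser();
        chooser.register("Turtle", "text/turtle", "ttl", () -> () -> "turtle-parser");
        chooser.register("N-Triples", "application/n-triples", "nt", () -> () -> "ntriples-parser");

        // The header wins over the extension unless the caller forces a language.
        System.out.println(chooser.chooseLang(null, "text/turtle; charset=utf-8", "data.nt"));
        System.out.println(chooser.chooseLang("N-Triples", "text/turtle", "data.ttl"));
        System.out.println(chooser.createParser("Turtle").name());
    }
}
```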


RDFParser provides a way to get all the parsing machinery together with 
defaults and using the parser factory registry.

       RDFParser.source("uri").parse(dest);

"dest" can be an existing model, a dataset, a graph in an existing 
dataset, or a StreamRDF.  It separates storage management from parsing.

Use of RDFParser is optional. The reader factory registry is public.

StreamRDF can be streaming, or collecting and processing.
Application choice.
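
As a small illustration of that choice, the following standard-library Java sketch contrasts a streaming destination with a collecting one. TripleSink is a hypothetical, much reduced stand-in for StreamRDF (the real interface also has callbacks for quads, base, prefixes and start/finish), and parse() here is a fake producer rather than a real parser.

```java
import java.util.ArrayList;
import java.util.List;

class SinkStyles {
    /** Hypothetical stand-in for the triple callback of a StreamRDF destination. */
    interface TripleSink { void triple(String s, String p, String o); }

    /** Streaming: handles each triple as it arrives and keeps only a running count. */
    static final class CountingSink implements TripleSink {
        long count;
        @Override public void triple(String s, String p, String o) { count++; }
    }

    /** Collecting: buffers every triple so it can be processed after parsing ends. */
    static final class CollectingSink implements TripleSink {
        final List<String[]> triples = new ArrayList<>();
        @Override public void triple(String s, String p, String o) {
            triples.add(new String[] { s, p, o });
        }
    }

    /** Fake "parser": pushes n made-up triples to whatever destination it is given. */
    static void parse(int n, TripleSink dest) {
        for (int i = 0; i < n; i++) dest.triple("s" + i, "p", "o" + i);
    }

    public static void main(String[] args) {
        CountingSink counter = new CountingSink();
        parse(1000, counter);          // memory use does not grow with input size
        System.out.println(counter.count);

        CollectingSink all = new CollectingSink();
        parse(3, all);                 // memory use grows with input size
        System.out.println(all.triples.size());
    }
}
```

The producer is identical in both cases; only the destination decides whether data is held in memory.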


RDFDataMgr collects up some common cases into one-liners.  Again, 
optional. Ditto model.read.


MessageBodyReader is opinionated for the HTTP/J2EE use case.

It is not designed for streaming large data: it creates intermediate buffer 
copies to return a <T>. JAX-RS is the Lang chooser for HTTP.

If the input is 1B triples, it can't go into a default model - straight 
to database is preferable even if it can be buffered.


What would be useful is to contribute a JAX-RS module for those people 
that want to use Apache Jena in a JAX-RS/J2EE situation and can work 
with the JAX-RS setup of returning a fresh Model (well, Graph is better).


The HttpClient issue is dependency management of version, not a specific 
issue.

     Andy

On 27/03/2020 12:39, Martynas Jusevičius wrote:
> On Fri, Mar 27, 2020 at 1:17 PM Andy Seaborne <an...@apache.org> wrote:
>> Jersey is currently using httpclient 4.5.9 - what makes you think it
>> won't work?
> 
> With the latest Jersey - maybe. But Core is currently on Jersey 1.19...
> 
> I was hoping I could upgrade one dependency at a time, but it looks
> like I'll need to do both.
> 
>>
>> The point of using java.net.http is to build on a solid, available (and
>> no dependency) base.
> 
> These are good points. But if you don't want Jersey as a dependency,
> Jena could depend only on the JAX-RS interfaces and roll its own
> client implementation. All the JAX-RS frameworks would be
> automatically compatible with the providers and able to read Jena's
> objects.
> 
> JAX-RS is "Java API for RESTful Web Services". Jena is written in
> Java, and the resources it reads are usually Linked Data, which is
> REST. Makes total sense to me.
> 
> I cannot stress the importance of providers enough. They hide parsing
> from the HTTP client and hide HTTP from parsers (by supplying an
> InputStream and a header map only).
> What I've shown in the example was a plain ModelProvider that
> reads/writes Models. But we have a subclass of it that does constraint
> validation, for example. It plugs in simply by registering a different
> provider; the rest of the code stays absolutely the same.
> Achieving the same modularity with Jena would currently be impossible.
> Do you not agree?
> 

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
On Fri, Mar 27, 2020 at 1:17 PM Andy Seaborne <an...@apache.org> wrote:
> Jersey is currently using httpclient 4.5.9 - what makes you think it
> won't work?

With the latest Jersey - maybe. But Core is currently on Jersey 1.19...

I was hoping I could upgrade one dependency at a time, but it looks
like I'll need to do both.

>
> The point of using java.net.http is to build on a solid, available (and
> no dependency) base.

These are good points. But if you don't want Jersey as a dependency,
Jena could depend only on the JAX-RS interfaces and roll its own
client implementation. All the JAX-RS frameworks would be
automatically compatible with the providers and able to read Jena's
objects.

JAX-RS is "Java API for RESTful Web Services". Jena is written in
Java, and the resources it reads are usually Linked Data, which is
REST. Makes total sense to me.

I cannot stress the importance of providers enough. They hide parsing
from the HTTP client and hide HTTP from parsers (by supplying an
InputStream and a header map only).
What I've shown in the example was a plain ModelProvider that
reads/writes Models. But we have a subclass of it that does constraint
validation, for example. It plugs in simply by registering a different
provider; the rest of the code stays absolutely the same.
Achieving the same modularity with Jena would currently be impossible.
Do you not agree?

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.

On 27/03/2020 11:20, Rob Vesse wrote:
> Martynas
> 
> I don't see what bearing JAX-RS has on the HttpClient part of this discussion
> 
> Jena needs something to manage the HTTP connections regardless of how it reads and writes data over those connections.  A lot of users have use cases that require authenticating themselves to their remote services, and providing direct access to the HTTP client gives users the ability to implement whatever authentication mechanism they need for their environment.  It also lets users choose how they do connection pooling, max connection limits, custom SSL setups etc.
> 
> Rob
> 
> On 27/03/2020, 10:46, "Martynas Jusevičius" <ma...@atomgraph.com> wrote:
> 
>      Andy,
>      
>      if I don't exclude Jena's HttpClient, it's going to clash with the
>      HttpClient that Jersey includes, as the versions are almost certainly
>      different.

Jersey is currently using httpclient 4.5.9 - what makes you think it 
won't work?

>      That's why I did it in the first place.
>      
>      What do I do then?
>      
>      I had proposed multiple times that Jena adopts JAX-RS
>      MessageBodyReader/Writer approach:
>      https://docs.oracle.com/javaee/7/api/javax/ws/rs/ext/MessageBodyReader.html
>      https://docs.oracle.com/javaee/7/api/javax/ws/rs/ext/MessageBodyWriter.html

The point of using java.net.http is to build on a solid, available (and 
no dependency) base.

>      
>      They cleanly separate the I/O for different media types.
>      

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Rob Vesse <rv...@dotnetrdf.org>.
Martynas

I don't see what bearing JAX-RS has on the HttpClient part of this discussion

Jena needs something to manage the HTTP connections regardless of how it reads and writes data over those connections.  A lot of users have use cases that require authenticating themselves to their remote services, and providing direct access to the HTTP client gives users the ability to implement whatever authentication mechanism they need for their environment.  It also lets users choose how they do connection pooling, max connection limits, custom SSL setups etc.

Rob


Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
Andy,

if I don't exclude Jena's HttpClient, it's going to clash with the
HttpClient that Jersey includes, as the versions are almost certainly
different. That's why I did it in the first place.

What do I do then?

I had proposed multiple times that Jena adopts JAX-RS
MessageBodyReader/Writer approach:
https://docs.oracle.com/javaee/7/api/javax/ws/rs/ext/MessageBodyReader.html
https://docs.oracle.com/javaee/7/api/javax/ws/rs/ext/MessageBodyWriter.html

They cleanly separate the I/O for different media types.


Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.
3.0.1 was a long time ago!

All the parsing code goes through RDFParser whichever API route you try.

RDFParser.fromString(validRDFPost)
    .base("http://base")
    .lang(RDFPOST)
    .parse(parsed);

The code needs to read URLs and provide control of the HTTP connection 
to be used so the user can set up whatever HTTP interaction they need.

That's HttpClient - the built-in networking doesn't make it easy to 
provide per-connection controls.

HttpClient is null.
Don't exclude HttpClient.
There isn't a common abstraction across HTTP libraries.


I have had a go at switching all network interaction to using 
java.net.http (Java11-onwards). As Jena currently uses HttpClient v4, 
and the v5 interfaces are quite different, change was eventually coming 
anyway.

But obv - Java11.

https://github.com/afs/jena-http

     Andy



Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
Oh right, this is the code I tried - got the exception both ways:

        Model parsed = ModelFactory.createDefaultModel();
        try (StringReader reader = new StringReader(validRDFPost))
        {
            RDFDataMgr.read(parsed, reader, "http://base", RDFLanguages.RDFPOST);
        }


        Model parsed = ModelFactory.createDefaultModel();
        try (StringReader reader = new StringReader(rdfPost))
        {
            parsed.read(reader, "http://base", "RDF/POST");
        }

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
Thanks.

I'm trying 3.14.0 for now. Refactored RDFPostReader without much
trouble, but now I get

java.lang.NoClassDefFoundError: org/apache/http/client/HttpClient
at org.apache.jena.riot.RDFParser.create(RDFParser.java:114)

because RDFParser now has a field of type HttpClient :/ I guess it
didn't in 3.0.1.
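
One way to narrow down an error like this is to ask the JVM whether it can find the class at all, and from which jar. The following is a minimal standard-library Java sketch; ClasspathCheck and locate are hypothetical names for this example, and the two class names in main are taken from the stack trace above. It only reports where a class was loaded from; the jar file name usually, though not always, reveals the version.

```java
import java.security.CodeSource;

class ClasspathCheck {
    /** Returns where the class was loaded from, or a note that it could not be loaded. */
    static String locate(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes typically have no code source.
            return src == null ? "JDK/bootstrap" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "MISSING (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // With the httpclient exclusion in place, the first line is expected to report MISSING.
        System.out.println(locate("org.apache.http.client.HttpClient"));
        System.out.println(locate("org.apache.jena.riot.RDFParser"));
    }
}
```

Running this with the same classpath as the failing application shows whether httpclient is absent or present in an unexpected version.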

I had excluded HTTP Client because Core is using Jersey Client:

        <dependency>
            <groupId>org.apache.jena</groupId>
            <artifactId>jena-arq</artifactId>
            <version>3.14.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.apache.httpcomponents</groupId>
                    <artifactId>httpclient</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.httpcomponents</groupId>
                    <artifactId>httpclient-cache</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

This makes me wonder why the parsers/writers are dealing directly with
HTTP. I already had this problem with the JsonLDWriter:
https://mail-archives.apache.org/mod_mbox/jena-users/201712.mbox/%3cCAE35VmyGk-biQ1Fayp3zoOyiMikZhvjA8dTuGEd3JaYR98uYOA@mail.gmail.com%3e

Not sure how to proceed.

On Thu, Mar 26, 2020 at 10:34 PM Andy Seaborne <an...@apache.org> wrote:
>
>
>
> On 26/03/2020 18:57, Martynas Jusevičius wrote:
> > Need to match SPIN RDF API which is on 3.13.1...
> > https://github.com/spinrdf/spinrdf/blob/master/pom.xml#L73
>
> spinrdf does not have any binary artifacts and is built locally.
>
> Changing the version before building would be possible.
>
>      Andy
>
> >
> > On Thu, 26 Mar 2020 at 17.32, Andy Seaborne <an...@apache.org> wrote:
> >
> >> After 3.14.0, use of "IRI" got wrapped up to limit the places it is used
> >> directly.
> >>
> >> Why 3.13.1 and not 3.14.0?
> >> Or 3.15.0-SNAPSHOT because of JENA-1838.
> >>
> >>
> >> On 26/03/2020 13:36, Martynas Jusevičius wrote:
> >>> Hi,
> >>>
> >>> I'm working on a long overdue upgrade of Jena.
> >>>
> >>> So far the area where I can see most changes will be needed is the
> >>> implementation of ReaderRIOT streaming parser for RDF/POST:
> >>>
> >> https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/riot/lang/RDFPostReader.java
> >>>
> >>> Is LangEngine the recommended base class for such parsers these days?
> >>>
> >> https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/riot/lang/LangEngine.java
> >>
> >> Or LangBase
> >>
> >>>
> >>> Currently it's extending ReaderRIOTBase.
> >>
> >> ReaderRIOT is API used by RDFParser.
> >> You can implement that (and the companion factory) directly if you want.
> >>
> >> LangBase etc are implementation helpers.
> >>
> >>> Also can't figure out what to replace
> >>> ParserProfile.getPrologue().setBaseURI() calls with. I can see the
> >>> latest LangTurtleBase uses ParserProfile.setBaseIRI(), but I can't
> >>> find such method in 3.13.1.
> >>
> >> 3.15.0-dev:
> >> ParserProfile::setBaseIRI(String)
> >>
> >>>
> >>> Thanks.
> >>>
> >>> Martynas
> >>>
> >>
> >

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.

On 26/03/2020 18:57, Martynas Jusevičius wrote:
> Need to match SPIN RDF API which is on 3.13.1...
> https://github.com/spinrdf/spinrdf/blob/master/pom.xml#L73

spinrdf does not have any binary artifacts and is built locally.

Changing the version before building would be possible.

     Andy

> 
> On Thu, 26 Mar 2020 at 17.32, Andy Seaborne <an...@apache.org> wrote:
> 
>> After 3.14.0, use of "IRI" got wrapped up to limit the places it is used
>> directly.
>>
>> Why 3.13.1 and not 3.14.0?
>> Or 3.15.0-SNAPSHOT because of JENA-1838.
>>
>>
>> On 26/03/2020 13:36, Martynas Jusevičius wrote:
>>> Hi,
>>>
>>> I'm working on a long overdue upgrade of Jena.
>>>
>>> So far the area where I can see most changes will be needed is the
>>> implementation of ReaderRIOT streaming parser for RDF/POST:
>>>
>> https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/riot/lang/RDFPostReader.java
>>>
>>> Is LangEngine the recommended base class for such parsers these days?
>>>
>> https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/riot/lang/LangEngine.java
>>
>> Or LangBase
>>
>>>
>>> Currently it's extending ReaderRIOTBase.
>>
>> ReaderRIOT is API used by RDFParser.
>> You can implement that (and the companion factory) directly if you want.
>>
>> LangBase etc are implementation helpers.
>>
>>> Also can't figure out what to replace
>>> ParserProfile.getPrologue().setBaseURI() calls with. I can see the
>>> latest LangTurtleBase uses ParserProfile.setBaseIRI(), but I can't
>>> find such method in 3.13.1.
>>
>> 3.15.0-dev:
>> ParserProfile::setBaseIRI(String)
>>
>>>
>>> Thanks.
>>>
>>> Martynas
>>>
>>
> 

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
Need to match SPIN RDF API which is on 3.13.1...
https://github.com/spinrdf/spinrdf/blob/master/pom.xml#L73

On Thu, 26 Mar 2020 at 17.32, Andy Seaborne <an...@apache.org> wrote:

> After 3.14.0, use of "IRI" got wrapped up to limit the places it is used
> directly.
>
> Why 3.13.1 and not 3.14.0?
> Or 3.15.0-SNAPSHOT because of JENA-1838.
>
>
> On 26/03/2020 13:36, Martynas Jusevičius wrote:
> > Hi,
> >
> > I'm working on a long overdue upgrade of Jena.
> >
> > So far the area where I can see most changes will be needed is the
> > implementation of ReaderRIOT streaming parser for RDF/POST:
> >
> https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/riot/lang/RDFPostReader.java
> >
> > Is LangEngine the recommended base class for such parsers these days?
> >
> https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/riot/lang/LangEngine.java
>
> Or LangBase
>
> >
> > Currently it's extending ReaderRIOTBase.
>
> ReaderRIOT is API used by RDFParser.
> You can implement that (and the companion factory) directly if you want.
>
> LangBase etc are implementation helpers.
>
> > Also can't figure out what to replace
> > ParserProfile.getPrologue().setBaseURI() calls with. I can see the
> > latest LangTurtleBase uses ParserProfile.setBaseIRI(), but I can't
> > find such method in 3.13.1.
>
> 3.15.0-dev:
> ParserProfile::setBaseIRI(String)
>
> >
> > Thanks.
> >
> > Martynas
> >
>

Re: Upgrading 3.0.1 to 3.13.1: ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.
After 3.14.0, use of "IRI" got wrapped up to limit the places it is used 
directly.

Why 3.13.1 and not 3.14.0?
Or 3.15.0-SNAPSHOT because of JENA-1838.


On 26/03/2020 13:36, Martynas Jusevičius wrote:
> Hi,
> 
> I'm working on a long overdue upgrade of Jena.
> 
> So far the area where I can see most changes will be needed is the
> implementation of ReaderRIOT streaming parser for RDF/POST:
> https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/riot/lang/RDFPostReader.java
> 
> Is LangEngine the recommended base class for such parsers these days?
> https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/riot/lang/LangEngine.java

Or LangBase

> 
> Currently it's extending ReaderRIOTBase.

ReaderRIOT is API used by RDFParser.
You can implement that (and the companion factory) directly if you want.

LangBase etc are implementation helpers.

> Also can't figure out what to replace
> ParserProfile.getPrologue().setBaseURI() calls with. I can see the
> latest LangTurtleBase uses ParserProfile.setBaseIRI(), but I can't
> find such method in 3.13.1.

3.15.0-dev:
ParserProfile::setBaseIRI(String)

> 
> Thanks.
> 
> Martynas
>