Posted to users@jena.apache.org by Martynas Jusevičius <ma...@atomgraph.com> on 2021/10/26 08:40:03 UTC

[3.16.0] Implementing ReaderRIOT

Hi,

I'm implementing a Reader that extracts <script
type="application/ld+json"> from HTML and then uses JsonLDReader to
parse that data.

I've got a few questions in the process:

1. How does one obtain the default ParserProfile and ErrorHandler?
Currently I'm using

    ParserProfile profile = RiotLib.profile(HtmlJsonLDReaderFactory.HTML, null,
        ErrorHandlerFactory.getDefaultErrorHandler());

but internally the methods it calls are all deprecated.

2. Can multiple StreamRDF be combined into one? There might be
multiple <script> elements and the reader should merge the data from
them. I attempted

    public void read(Reader in, String baseURI, Lang lang, StreamRDF output, Context context)
    {
        ...
        for (Element element : jsonLdElements)
        {
            String jsonLd = element.data();
            getJsonLDReader().read(new StringReader(jsonLd), baseURI,
                JSONLD.getContentType(), output, context);
        }
    }

but only the first element gets read. I guess that is because
getJsonLDReader().read() calls output.start()/output.finish().

Full Reader code:
https://github.com/AtomGraph/LinkedDataHub/blob/html-jsonld-reader/src/main/java/com/atomgraph/linkeddatahub/io/HtmlJsonLDReader.java#L73

Martynas
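
The extraction step described above (pulling the bodies of
<script type="application/ld+json"> elements out of an HTML page) can be
sketched with the JDK alone. This is only an illustration: the linked Reader
works on parsed HTML elements (element.data()), whereas this sketch swaps in a
regular expression, which is adequate only for simple, well-formed pages. The
class name JsonLdScriptExtractor and its extract() method are hypothetical and
are not part of Jena or LinkedDataHub.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Jena or LinkedDataHub.
class JsonLdScriptExtractor {

    // Matches <script ... type="application/ld+json" ...>BODY</script>,
    // case-insensitively, with '.' allowed to span newlines.
    // Assumption: a regex stands in for a real HTML parser here.
    private static final Pattern JSONLD_SCRIPT = Pattern.compile(
        "<script\\b[^>]*\\btype\\s*=\\s*[\"']application/ld\\+json[\"'][^>]*>(.*?)</script\\s*>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed body of every JSON-LD script element, in document order.
    static List<String> extract(String html) {
        List<String> blocks = new ArrayList<>();
        Matcher m = JSONLD_SCRIPT.matcher(html);
        while (m.find()) {
            String body = m.group(1).trim();
            if (!body.isEmpty()) {
                blocks.add(body);
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<script type=\"application/ld+json\">{\"@id\": \"http://example.org/a\"}</script>"
            + "<script type=\"text/javascript\">var x = 1;</script>"
            + "</head></html>";
        // Prints each extracted JSON-LD block on its own line.
        for (String block : extract(html)) {
            System.out.println(block);
        }
    }
}
```

Each returned string would then be handed to the JSON-LD parser in turn, as in
the read() loop above.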

Re: [3.16.0] Implementing ReaderRIOT

Posted by Martynas Jusevičius <ma...@atomgraph.com>.
On Tue, Oct 26, 2021 at 11:52 AM Andy Seaborne <an...@apache.org> wrote:
>
>
> On 26/10/2021 09:40, Martynas Jusevičius wrote:
> > Hi,
> >
> > I'm implementing a Reader that extracts <script
> > type="application/ld+json"> from HTML and then uses JsonLDReader to
> > parse that data.
> >
> > I've got a few questions in the process:
> >
> > 1. How does one obtain the default ParserProfile and ErrorHandler?
> > Currently I'm using
> >
> >      ParserProfile profile =
> > RiotLib.profile(HtmlJsonLDReaderFactory.HTML, null,
> > ErrorHandlerFactory.getDefaultErrorHandler());
>
> Use
>
> RDFParser.create()... parse(stream);
>
> then don't call .factory() .errorHandler() on the RDFParserBuilder and
> the defaults will be used.

I'm doing some hacking because currently I'm stuck with 3.16.0 which
does not support JsonLDReader.JSONLD_OPTIONS.
So I can't properly initialize RDFParser for JSON-LD. For now I've
copied the JsonLDReader class from 3.17.0 which has JSONLD_OPTIONS.

> > but internally the methods it calls are all deprecated.
> >
> > 2. Can multiple StreamRDF be combined into one?
>
> A single StreamRDF can be reused unless it documents otherwise.  A graph
> sink can be used multiple times.
>
> > There might be
> > multiple <script> elements and the reader should merge the data from
> > them. I attempted
> >
> >      public void read(Reader in, String baseURI, Lang lang, StreamRDF
> > output, Context context)
> >      {
> >          ...
> >          for (Element element : jsonLdElements)
> >          {
> >              String jsonLd = element.data();
> >              getJsonLDReader().read(new StringReader(jsonLd), baseURI,
> > JSONLD.getContentType(), output, context);
> >          }
> >      }
> >
> > but only the first element gets read,
>
> > I guess because
> > getJsonLDReader().read() calls output.start()/output.finish().
>
> Unlikely - .start/.finish can be called multiple times (unless the
> StreamRDF explicitly documents otherwise). A graph sink can be used
> multiple times.

Right. It seems to work, I think I looked at the wrong examples when I
wrote this -- sorry about that.

The code can be found here:
https://github.com/AtomGraph/LinkedDataHub/blob/develop/src/main/java/com/atomgraph/linkeddatahub/io/HtmlJsonLDReader.java

Re: [3.16.0] Implementing ReaderRIOT

Posted by Andy Seaborne <an...@apache.org>.
On 26/10/2021 09:40, Martynas Jusevičius wrote:
> Hi,
> 
> I'm implementing a Reader that extracts <script
> type="application/ld+json"> from HTML and then uses JsonLDReader to
> parse that data.
> 
> I've got a few questions in the process:
> 
> 1. How does one obtain the default ParserProfile and ErrorHandler?
> Currently I'm using
> 
>      ParserProfile profile =
> RiotLib.profile(HtmlJsonLDReaderFactory.HTML, null,
> ErrorHandlerFactory.getDefaultErrorHandler());

Use

RDFParser.create()... parse(stream);

If you don't call .factory() or .errorHandler() on the RDFParserBuilder,
the defaults will be used.

> 
> but internally the methods it calls are all deprecated.
> 
> 2. Can multiple StreamRDF be combined into one?

A single StreamRDF can be reused unless it documents otherwise.  A graph 
sink can be used multiple times.

> There might be
> multiple <script> elements and the reader should merge the data from
> them. I attempted
> 
>      public void read(Reader in, String baseURI, Lang lang, StreamRDF
> output, Context context)
>      {
>          ...
>          for (Element element : jsonLdElements)
>          {
>              String jsonLd = element.data();
>              getJsonLDReader().read(new StringReader(jsonLd), baseURI,
> JSONLD.getContentType(), output, context);
>          }
>      }
> 
> but only the first element gets read,

> I guess because
> getJsonLDReader().read() calls output.start()/output.finish().

Unlikely - .start/.finish can be called multiple times (unless the
StreamRDF explicitly documents otherwise). A graph sink can be used
multiple times.

     Andy
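
The point about .start/.finish can be illustrated without Jena on the
classpath. The sketch below is a simplified model, not the Jena API:
TripleSink is a hypothetical stand-in for a StreamRDF-style interface, and
CollectingSink plays the role of an accumulating graph sink. It assumes, as
the original post does, that each reader invocation brackets its output with
start() and finish(). Because the sink never clears its contents in either
call, the output of several runs ends up merged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a StreamRDF-style sink. This is NOT the Jena interface.
interface TripleSink {
    void start();
    void triple(String subject, String predicate, String object);
    void finish();
}

// Accumulating sink: start()/finish() bracket a parse run but never clear collected data.
class CollectingSink implements TripleSink {
    final List<String> triples = new ArrayList<>();
    int runs = 0;

    @Override public void start() { runs++; }

    @Override public void triple(String s, String p, String o) {
        triples.add(s + " " + p + " " + o + " .");
    }

    @Override public void finish() { /* nothing to flush in this sketch */ }
}

class SinkReuseDemo {

    // Simulates one reader invocation: start(), emit every triple, finish().
    static void emit(TripleSink sink, List<String[]> block) {
        sink.start();
        for (String[] t : block) {
            sink.triple(t[0], t[1], t[2]);
        }
        sink.finish();
    }

    // Feeds every block into the same sink, one simulated parse run per block.
    static CollectingSink mergeAll(List<List<String[]>> blocks) {
        CollectingSink sink = new CollectingSink();
        for (List<String[]> block : blocks) {
            emit(sink, block);
        }
        return sink;
    }

    public static void main(String[] args) {
        List<String[]> first = Collections.singletonList(
            new String[] {"<http://example.org/a>", "<http://example.org/p>", "\"1\""});
        List<String[]> second = Collections.singletonList(
            new String[] {"<http://example.org/b>", "<http://example.org/p>", "\"2\""});

        List<List<String[]>> blocks = new ArrayList<>();
        blocks.add(first);
        blocks.add(second);

        // Prints every triple collected across both simulated runs.
        for (String line : mergeAll(blocks).triples) {
            System.out.println(line);
        }
    }
}
```

A sink that did reset its state in start() would show the symptom from the
original question in reverse (only the last block surviving), which is why the
advice is to check what the particular StreamRDF documents.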


> Full Reader code:
> https://github.com/AtomGraph/LinkedDataHub/blob/html-jsonld-reader/src/main/java/com/atomgraph/linkeddatahub/io/HtmlJsonLDReader.java#L73
> 
> Martynas
>