Posted to dev@jena.apache.org by Rob Vesse <rv...@dotnetrdf.org> on 2017/10/17 13:51:40 UTC
Supporting Concatenated Gzip archives
Andy
Would it be worth pulling in commons-compress as a dependency and switching to its GZip stream implementation, which does not have this limitation?
This is a trivial change, but it does add an additional dependency.
Rob
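A quick way to see what the JVM in use does with a multi-member gzip file is to build one in memory and read it back through java.util.zip.GZIPInputStream. The sketch below is only an illustration, not Jena code: the class name ConcatGzipCheck, its helper methods and the example triples are made up for this purpose. It deliberately states no expected result, since behaviour has differed between JDK releases and may also depend on the underlying stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
class ConcatGzipCheck {

    /** Compress the text into one complete gzip member. */
    public static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return out.toByteArray();
    }

    /** Append one gzip member to another, as "cat a.gz b.gz" would. */
    public static byte[] concat(byte[] a, byte[] b) {
        byte[] joined = new byte[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    /** Count the lines the JDK's GZIPInputStream yields from the bytes. */
    public static int countLines(byte[] gzipped) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)),
                StandardCharsets.UTF_8))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two one-line N-Triples documents, each compressed separately.
        byte[] first = gzip("<http://example/s1> <http://example/p> \"a\" .\n");
        byte[] second = gzip("<http://example/s2> <http://example/p> \"b\" .\n");
        byte[] both = concat(first, second);

        // Fewer than 2 lines would mean only part of the file was read.
        System.out.println("lines read: " + countLines(both) + " of 2");
    }
}
```

Running the same count against the real nt.gz file (via a FileInputStream instead of the in-memory bytes) and comparing it with the triple count rapper reports would show whether the decompression step is where the data goes missing.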
On 17/10/2017 14:08, "Andy Seaborne" <an...@apache.org> wrote:
In addition to Rob's point about multiple files in one GZ file...
What does the Fuseki log say?
Can you upload the NT file uncompressed?
How are you uploading the nt.gz file?
Andy
On 17/10/17 05:15, Rob Vesse wrote:
> Do you know how the original GZip archive was generated?
>
> Jena uses the standard JDK GZip support to read GZip archives. The JDK doesn’t support the case where multiple separate GZip streams are concatenated into a single file. Therefore, if the archive was created in that way Jena might only read the first stream from the archive and ignore the subsequent streams.
>
> Extracting with rapper probably uses the OS gzip directly, or a library implementation of it, which does handle this concatenation.
>
> Is this a file you could share somehow?
>
> Rob
>
> On 17/10/2017 03:55, "Andrew U. Frank" <fr...@geoinfo.tuwien.ac.at> wrote:
>
> i experience a strange effect (replicated a few times):
> i upload data in nt.gz format and get a success message, but only a part
> (sometimes less than 10%) is uploaded.
> if i extract the nt file from the nt.gz and then convert it with rapper to
> turtle format, i get information on how many triples are in the nt.gz
> file, and when i then upload the ttl file all triples are loaded.
> i use the browser upload.
>
> any explanation? i use fuseki 3.4.0.
>
> thank you!
> andrew
>
Re: Supporting Concatenated Gzip archives
Posted by Andy Seaborne <an...@apache.org>.
Not for 3.5.0 :-)
Actually, I'm not sure that the user's question is clear - maybe it's the HTTP
gzip option, which is why I wanted to see the Fuseki log and know how he's
pushing the file(s). I assume it's all his RDF/XML files, converted.
Maybe better to see it as "upload collection" and include zip and
tar/tar.gz files? NT can be concatenated, RDF/XML cannot.
Andy
On 17/10/17 09:51, Rob Vesse wrote:
> Andy
>
> Would it be worth pulling in commons-compress as a dependency and switching to its GZip stream implementation, which does not have this limitation?
>
> This is a trivial change, but it does add an additional dependency.
>
> Rob