Posted to users@jena.apache.org by Glenn Proctor <gl...@eaglegenomics.com> on 2012/02/27 14:23:18 UTC

400 Unknown error from Fuseki when trying to load nq and n3 files

Hi

I am trying to load a large (500 MB) file of N-Quads into an instance
of Fuseki. The command I am using is

s-put http://localhost:3030/dataset/data hgnc ~/Desktop/hgnc.nq

This fails with the following error:

400 Unknown: text/n-quads;charset=ascii

I'm assuming the 400 here is the HTTP status code for "bad request".

I get the same error whether I use a memory-backed or TDB-backed
Fuseki instance. I've tried a concatenated version of the file in
question and I get the same behaviour.

The file in question is the uncompressed version of
http://download.bio2rdf.org/data/hgnc/hgnc.nq.gz

Another file, in n3 format, gives a very similar error. I'm sure
there's something simple I'm missing, and I'd be grateful for any
pointers.

Regards

Glenn.

Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Andy Seaborne <an...@apache.org>.
Glenn Proctor wrote:
> s-put http://localhost:3030/dataset/data hgnc ~/Desktop/hgnc.nq
>
> This fails with the following error:
> 400 Unknown: text/n-quads;charset=ascii

You are PUTing a file to a named graph "hgnc" in dataset "dataset".

But you are PUTing N-Quads, which is a multigraph format. Even if the data
is all for the default graph (i.e. triples), the system does not know that
when it is deciding which parser to use.

You can't send quads into a graph.

If it's truly N-Triples, then use the file extension ".nt", or use curl/wget
and set the content type to "text/plain" or (IMHO better)
"application/n-triples".  Or "text/turtle".

If you want to load quads into a dataset, you may be better off running the
bulk loader offline and then publishing the database.

	Andy
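
To illustrate the point about sending triples to a single graph, here is a
minimal sketch. It assumes a local Fuseki at
http://localhost:3030/dataset/data and a file that really contains only
triples. The helper name put_triples and the example graph IRI are
hypothetical and not part of Fuseki or soh. It uses --data-binary because
curl's plain -d strips newlines when reading from a file.

```shell
# Sketch: PUT a triples file into one graph via the Graph Store Protocol.
# put_triples is a hypothetical helper; endpoint and graph are placeholders.
put_triples() {
    endpoint=$1   # e.g. http://localhost:3030/dataset/data
    graph=$2      # "default", or a graph IRI with no characters needing escaping
    file=$3       # an N-Triples file (triples only, no quads)
    if [ "$graph" = "default" ]; then
        target="$endpoint?default"
    else
        target="$endpoint?graph=$graph"
    fi
    # Exactly one media type is sent, and the file body is sent unchanged.
    curl -X PUT -H "Content-Type: application/n-triples" \
         --data-binary "@$file" "$target"
}

# Usage (not run here):
#   put_triples http://localhost:3030/dataset/data default /tmp/hgnc-100.nt
#   put_triples http://localhost:3030/dataset/data http://example.org/hgnc /tmp/hgnc-100.nt
```

A quads file has no single target graph, so it does not fit this kind of
request; that case is what the offline bulk loader is for.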

Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Glenn Proctor <gl...@eaglegenomics.com>.
Thanks Paolo - I know that tdbloader2 can handle n3 files. However, the
particular files I was using had some issues with malformed URIs, as
well as the ------------------- lines you spotted, so the conversion
step helped clean these up. This step wouldn't have been necessary if
the files had been properly formatted in the first place ...

Regards

Glenn.


On Tue, Feb 28, 2012 at 4:01 PM, Paolo Castagna
<ca...@googlemail.com> wrote:
> Glenn Proctor wrote:
>> Hi folks
>>
>> Thanks for the helpful replies. In the end I used rapper to convert
>> the n3/nq files to rdf/xml, and then tdbloader2 to bulk load the
>> resulting files into TDB. As Andy suggested this was much quicker than
>> doing everything via Fuseki.
>
> You can load N-Triples | N-Quads with tdbloader|tdbloader2,
> that should even be faster.
>
> Paolo

Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Paolo Castagna <ca...@googlemail.com>.
Glenn Proctor wrote:
> Hi folks
> 
> Thanks for the helpful replies. In the end I used rapper to convert
> the n3/nq files to rdf/xml, and then tdbloader2 to bulk load the
> resulting files into TDB. As Andy suggested this was much quicker than
> doing everything via Fuseki.

You can load N-Triples | N-Quads with tdbloader|tdbloader2,
that should even be faster.

Paolo
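
A minimal sketch of that direct load, assuming tdbloader is on the PATH and
that no running Fuseki is using the target directory at the same time. The
helper name bulk_load and the paths are hypothetical.

```shell
# Sketch: bulk load N-Triples / N-Quads files straight into a TDB directory.
# bulk_load is a hypothetical helper; paths are placeholders.
bulk_load() {
    db_dir=$1    # directory for the TDB database, e.g. /data/tdb/hgnc
    shift        # remaining arguments: one or more .nt or .nq files
    mkdir -p "$db_dir" &&
    tdbloader --loc="$db_dir" "$@"
}

# Usage (not run here):
#   bulk_load /data/tdb/hgnc ~/Desktop/hgnc.nq
```

The loaded directory can then be published with something like
"fuseki-server --loc=/data/tdb/hgnc /dataset".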



Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Glenn Proctor <gl...@eaglegenomics.com>.
Hi folks

Thanks for the helpful replies. In the end I used rapper to convert
the n3/nq files to rdf/xml, and then tdbloader2 to bulk load the
resulting files into TDB. As Andy suggested this was much quicker than
doing everything via Fuseki.

I've now started a Fuseki server on top of the TDB I created and it's
working very well.

Thanks for the help

Glenn.
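
The conversion step described above might look roughly like the sketch
below. The exact rapper invocation used is not given in this message, so the
input format names and the helper to_rdfxml are assumptions; "turtle" only
works for N3 files that stay within the Turtle subset.

```shell
# Sketch: re-serialise a file as RDF/XML with rapper (Raptor).
# to_rdfxml is a hypothetical helper; format names are assumptions.
to_rdfxml() {
    in_format=$1   # a Raptor parser name, e.g. turtle, ntriples, nquads
    in_file=$2
    out_file=$3
    # rapper writes the serialised triples to stdout and messages to stderr.
    rapper -i "$in_format" -o rdfxml "$in_file" > "$out_file"
}

# Usage (not run here):
#   to_rdfxml turtle hgnc-100.n3 hgnc-100.rdf
```

Converting quads to RDF/XML this way keeps only the triples, since RDF/XML
has no way to carry graph names.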


On Mon, Feb 27, 2012 at 3:47 PM, Paolo Castagna
<ca...@googlemail.com> wrote:
> Paolo Castagna wrote:
>> Next step (mine or yours) is to check in the Fuseki source code whether
>> PUT handles other RDF serializations (and if not, this could be a good
>> candidate for a new feature request).
>>
>> I found the parseBody method in Fuseki, but I'll look in detail later.
>> Here it is, just in case another pair of eyes is faster than mine:
>> http://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_REST.java
>
> After having seen Andy's reply... oh, yes!
>
> No problem in Fuseki, this also works:
> curl -X PUT -H "Content-Type: application/n-triples" -d@/tmp/hgnc-100.nt \
>   "http://localhost:3030/dataset/data?default"
>
> Andy, do we have a problem in soh [1], line 47?
> $fileMediaTypes['n3']    = 'text/rdf+n3application/rdf+n3'
>
> I am not sure which one is the correct one.
>
> Paolo
>
>  [1] http://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/soh

Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Paolo Castagna <ca...@googlemail.com>.
Paolo Castagna wrote:
> Next step (mine or yours) is to check in the Fuseki source code whether
> PUT handles other RDF serializations (and if not, this could be a good
> candidate for a new feature request).
> 
> I found the parseBody method in Fuseki, but I'll look in detail later.
> Here it is, just in case another pair of eyes is faster than mine:
> http://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_REST.java

After having seen Andy's reply... oh, yes!

No problem in Fuseki, this also works:
curl -X PUT -H "Content-Type: application/n-triples" -d@/tmp/hgnc-100.nt \
  "http://localhost:3030/dataset/data?default"

Andy, do we have a problem in soh [1], line 47?
$fileMediaTypes['n3']    = 'text/rdf+n3application/rdf+n3'

I am not sure which one is the correct one.

Paolo

 [1] http://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/soh
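
The value quoted above looks like two media types run together. A minimal
sketch of a lookup that yields exactly one type per file extension follows.
media_type_for is a hypothetical helper, not the soh code, and the choice of
"text/n3" for .n3 is an assumption: it is the IANA-registered type, but a
given Fuseki version may expect one of the older names.

```shell
# Sketch: map a file name to exactly one media type for a single-graph upload.
# media_type_for is a hypothetical helper; the .n3 mapping is an assumption.
media_type_for() {
    case "$1" in
        *.nt)  echo "application/n-triples" ;;
        *.ttl) echo "text/turtle" ;;
        *.n3)  echo "text/n3" ;;
        *.rdf) echo "application/rdf+xml" ;;
        *)     # e.g. .nq: quads have no single target graph
               echo "no single-graph media type for: $1" >&2
               return 1 ;;
    esac
}

# Usage (not run here):
#   curl -X PUT -H "Content-Type: $(media_type_for hgnc-100.nt)" ...
```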

Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Paolo Castagna <ca...@googlemail.com>.
Glenn Proctor wrote:
> Hi Paolo
> 
> Thanks for looking into this for me. I've tried using the n3 file from
> the same source, filtering out any --------- lines, and including only
> 100 lines. The test file I'm using is
> 
> http://dl.dropbox.com/u/23033/hgnc-100.n3

Hi Glenn,
ok... this file is correct.

You are using the SPARQL 1.1 Graph Store HTTP Protocol spec to upload
your data.

All the examples in that spec use RDF/XML (i.e. application/rdf+xml),
but this is not a good reason not to support other serializations.
I've also tried to use curl instead of s-put (just to take another
variable off the table). I see the same problem.

No problem with:
curl -X PUT -H "Content-Type: application/rdf+xml" -d@/tmp/hgnc-100.rdf \
  "http://localhost:3030/dataset/data?default"

Next step (mine or yours) is to check in the Fuseki source code whether
PUT handles other RDF serializations (and if not, this could be a good
candidate for a new feature request).

I found the parseBody method in Fuseki, but I'll look in detail later.
Here it is, just in case another pair of eyes is faster than mine:
http://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/src/main/java/org/apache/jena/fuseki/servlets/SPARQL_REST.java

> 
> There are no unusual lines (as far as I can see) and no ^^ prefixes.
> The file linked above validates using
> http://www.rdfabout.com/demo/validator/ and also on the command line
> using the rapper utility from the Raptor library (89 triples in
> total).
> 
> However when I try to start a simple Fuseki instance using
> 
> fuseki-server --update --mem /dataset
> 
> and load in the n3 file using
> 
> s-put http://localhost:3030/dataset/data default ~/Dropbox/Public/hgnc-100.n3
> 
> I get
> 
> 400 Unknown: text/rdf+n3application/rdf+n3

I think this "text/rdf+n3application/rdf+n3" is also a problem in
the s-put script. Should there be a comma between the two media types,
or just one of them?

Paolo

> http://localhost:3030/dataset/data?default
> 
> I can't see what it is about the file that Fuseki doesn't like.
> 
> Glenn.


Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Glenn Proctor <gl...@eaglegenomics.com>.
Hi Paolo

Thanks for looking into this for me. I've tried using the n3 file from
the same source, filtering out any --------- lines, and including only
100 lines. The test file I'm using is

http://dl.dropbox.com/u/23033/hgnc-100.n3

There are no unusual lines (as far as I can see) and no ^^ prefixes.
The file linked above validates using
http://www.rdfabout.com/demo/validator/ and also on the command line
using the rapper utility from the Raptor library (89 triples in
total).

However when I try to start a simple Fuseki instance using

fuseki-server --update --mem /dataset

and load in the n3 file using

s-put http://localhost:3030/dataset/data default ~/Dropbox/Public/hgnc-100.n3

I get

400 Unknown: text/rdf+n3application/rdf+n3
http://localhost:3030/dataset/data?default

I can't see what it is about the file that Fuseki doesn't like.

Glenn.

On Mon, Feb 27, 2012 at 1:43 PM, Paolo Castagna
<ca...@googlemail.com> wrote:
> Glenn Proctor wrote:
>> The file in question is the uncompressed version of
>> http://download.bio2rdf.org/data/hgnc/hgnc.nq.gz
>
> Hi Glenn,
> maybe this is not the problem (or maybe it is).
>
> I've just noticed that the hgnc.nq.gz file above starts/ends
> with "--------------------" (i.e. it's not a valid N-Quads
> file).
>
> Paolo

Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Paolo Castagna <ca...@googlemail.com>.
Glenn Proctor wrote:
> The file in question is the uncompressed version of
> http://download.bio2rdf.org/data/hgnc/hgnc.nq.gz

Hi Glenn,
maybe this is not the problem (or maybe it is).

I've just noticed that the hgnc.nq.gz file above starts/ends
with "--------------------" (i.e. it's not a valid N-Quads
file).

Paolo
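
A minimal sketch of filtering those separator lines out before loading,
assuming they consist only of dashes (plus optional trailing whitespace).
The helper name strip_dash_lines is hypothetical.

```shell
# Sketch: drop lines made up only of dashes from an N-Quads stream.
# strip_dash_lines is a hypothetical helper.
strip_dash_lines() {
    # Two or more dashes, optional trailing whitespace (covers CR), nothing else.
    grep -v '^-\{2,\}[[:space:]]*$'
}

# Usage (not run here):
#   strip_dash_lines < hgnc.nq > hgnc-clean.nq
```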

Re: 400 Unknown error from Fuseki when trying to load nq and n3 files

Posted by Paolo Castagna <ca...@googlemail.com>.
Glenn Proctor wrote:
> The file in question is the uncompressed version of
> http://download.bio2rdf.org/data/hgnc/hgnc.nq.gz
> 
> Another file, in n3 format, gives a very similar error. I'm sure
> there's something simple I'm missing, and I'd be grateful for any
> pointers.

Even after filtering out the lines with "-------...", you have lines
such as:

<http://bio2rdf.org/hugo:A1BG> <http://bio2rdf.org/hugo_resource:approvedSymbol>
"A1BG"^^xsd:string <http://bio2rdf.org/hgnc_record:5> .

I don't think ^^xsd:string is correct (i.e. N-Quads files do not have
any notion of prefixes).

Paolo
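
One way to repair such lines is to expand the prefixed datatype into a full
IRI. This is a minimal sketch and assumes that "xsd:" in the file means the
usual XML Schema namespace, http://www.w3.org/2001/XMLSchema#. The helper
name expand_xsd is hypothetical. It is a plain text substitution, so it
would also rewrite a matching string inside a literal.

```shell
# Sketch: rewrite ^^xsd:NAME as ^^<http://www.w3.org/2001/XMLSchema#NAME>.
# expand_xsd is a hypothetical helper; the namespace binding is an assumption.
expand_xsd() {
    sed 's|\^\^xsd:\([A-Za-z][A-Za-z0-9]*\)|^^<http://www.w3.org/2001/XMLSchema#\1>|g'
}

# Usage (not run here):
#   expand_xsd < hgnc.nq > hgnc-fixed.nq
```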