Posted to users@jena.apache.org by Nicolas Nobelis <ni...@gmail.com> on 2012/02/01 07:38:32 UTC

Individual with 'percent' in its URI

 Hello,

First of all, sorry if this question has already been asked. I did a search
and some old results popped up
(http://tech.groups.yahoo.com/group/jena-dev/message/1480), but this is still
not clear to me.

1/ First question:
I would like to create an individual with a space in its name:
"room_Hamburg_Am Hauptbahnhof 3_07.OG_NIC7"

To do so, I encode the name with URIRef and I get something like:
"room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7"

Is this local name legal?

Now, if I create the individual :
String namespace = "http://example.com/location.owl#"
OntModel m_model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
Individual i = m_model.createIndividual(namespace +
"room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7", aClass)
and I print the local name:
println(i.getLocalName())

I then get :
_07.OG_NIC7

Obviously there is something wrong here, because I expect
"room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7", right ?


2/ Other question:

Is there a magic method in the Jena API that, given a String, transforms it
into a proper localName (i.e. sanitizes it)?

I thought that URIRef.encode was doing this job, but it doesn't correct some
illegal local names, such as the ones beginning with numbers (an entity with
URI "http://example.com/location.owl#123" is illegal according to the specs
(http://www.w3.org/TR/xml-names/#NT-NCName)).

I also tried IRIFactory, but it seems to only report errors, not correct them
by itself.

Thanks a lot in advance for any answer,

Best regards

N. Nobelis

Re: Individual with 'percent' in its URI

Posted by Andy Seaborne <an...@apache.org>.
On 01/02/12 11:47, Nicolas Nobelis wrote:
> Hello Andy,
>
> Thanks a lot for this thorough explanation !
>
> I didn't know the local name was always supposed to be legal for XML.
>
> Because I thought that the local name was always the part of the URI
> after the '#' (the fragment), my application relied heavily on the
> getLocalName() method.
> Since I'm now facing odd behaviour due to individuals with encoded
> spaces in the URI, I'll follow your advice and store the encoded name
> anyway. Then I'll simply write my own method to get the fragment
> (non-namespace) part of the URI.

That's a good policy.
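
For the archive, here is a minimal sketch of such a helper. It is only an illustration: the class and method names (FragmentSketch, rawName) are made up, and it assumes the full URI string is obtained from the resource first (for example via getURI()). It splits on the last '#', or on the last '/' when there is no '#'.

```java
import java.net.URI;

class FragmentSketch {
    /** Hypothetical helper: the part after the last '#', or after the last '/' if there is no '#'. */
    static String rawName(String uri) {
        int hash = uri.lastIndexOf('#');
        int cut = (hash >= 0) ? hash : uri.lastIndexOf('/');
        // If neither separator is present, cut is -1 and the whole string is returned.
        return uri.substring(cut + 1);
    }

    public static void main(String[] args) {
        String uri = "http://example.com/location.owl#room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7";
        // The name exactly as stored, percent-escapes included
        System.out.println(rawName(uri));
        // java.net.URI can decode the escapes of the fragment, e.g. for display
        System.out.println(URI.create(uri).getFragment());
    }
}
```

The raw form is probably the one to keep as an identifier; the decoded form is only meant for showing to users.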

> Thanks again for the explanation and for all the refs !
>
> With RDF 1.1, will an individual with a local name containing %20 or \:
> still be writable to RDF/XML?

IIRC, an individual is always written as a URI, or relative URI, in RDF/XML.

Classes (striped syntax) and properties can be qnames.

You won't get unwritable individuals - they just may not look nice.

	Andy

Re: Individual with 'percent' in its URI

Posted by Nicolas Nobelis <ni...@nobelis.eu>.
Hello Andy,

Thanks a lot for this thorough explanation !

I didn't know the local name was always supposed to be legal for XML.

Because I thought that the local name was always the part of the URI
after the '#' (the fragment), my application relied heavily on the
getLocalName() method.
Since I'm now facing odd behaviour due to individuals with encoded
spaces in the URI, I'll follow your advice and store the encoded name
anyway. Then I'll simply write my own method to get the fragment
(non-namespace) part of the URI.

Thanks again for the explanation and for all the refs !

With RDF 1.1, will an individual with a local name containing %20 or \:
still be writable to RDF/XML?

Best Regards

Nicolas Nobelis

Re: Individual with 'percent' in its URI

Posted by Andy Seaborne <an...@apache.org>.
Hi Nicolas,

On 01/02/12 06:38, Nicolas Nobelis wrote:
> Hello,
>
> First of all, sorry if this question has already been asked. I did a
> search and some old results popped up
> (http://tech.groups.yahoo.com/group/jena-dev/message/1480), but this
> is still not clear to me.
>
> 1/ First question: I would like to create an individual with a space
> in its name: "room_Hamburg_Am Hauptbahnhof 3_07.OG_NIC7"
>
> To do so, I encode the name with URIRef and I get something like:
> "room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7"

Yes - you have to do this.

Strictly, space is allowed in an "RDF URI Reference" but none of the 
syntaxes can write it down safely because they use "URIs".  i.e. you 
can't use spaces in practice.

RDF 1.1 will remove the phrase "RDF URI Reference" and say "IRI", and there
you can't use a raw space.

RDF-2004 tried to align with RFC 3987, but was published before the RFC,
which changed its mind on spaces.
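
As an illustration of the encoding step, here is a sketch that uses only the JDK rather than the Jena helper mentioned in the question. The class and method names (EncodeSketch, encode) are hypothetical. It relies on the multi-argument java.net.URI constructor, which quotes characters that are not legal in a URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

class EncodeSketch {
    /** Hypothetical helper: percent-encodes 'name' as a URI fragment and appends it to 'namespace'. */
    static String encode(String namespace, String name) {
        try {
            // The (scheme, ssp, fragment) constructor quotes illegal characters such as space.
            // Caution: it also quotes '%', so passing an already encoded name double-encodes it.
            String encoded = new URI(null, null, name).getRawFragment();
            return namespace + encoded;
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode: " + name, e);
        }
    }

    public static void main(String[] args) {
        String ns = "http://example.com/location.owl#";
        System.out.println(encode(ns, "room_Hamburg_Am Hauptbahnhof 3_07.OG_NIC7"));
    }
}
```

The result is what goes into the data as the full URI of the individual. The encoding should be applied exactly once, to the unencoded name.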

>
> Is this local name legal?

It is a legal part of a URI.

A "local name" is a syntactic device for writing down URIs.

If a URI can't be written as a prefixed name (e.g. in Turtle), it will be
written in the long <...> form.

RDF/XML is a bit different - properties can only be written as qnames,
so it must be possible to have namespace + local name.

>
> Now, if I create the individual:
> String namespace = "http://example.com/location.owl#"
> OntModel m_model =
> ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
> Individual i = m_model.createIndividual(namespace +
> "room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7", aClass)

So you are passing the full URI to Jena - your code did the + before
the call to createIndividual.

> and I print the local name: println(i.getLocalName())
>
> I then get : _07.OG_NIC7

Internally, Jena stores the URI.  If you ask for the local name, it
calculates the longest part that is still legal for XML.

In this case, "_07.OG_NIC7" is the best it can do.  The

>
> Obviously there is something wrong here, because I expect
> "room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7", right ?

Afraid not - "_07.OG_NIC7" is the best the system can do.

Note: local names don't really matter - they are just surface syntax.
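
To make the "longest part that is still legal for XML" rule concrete, here is a rough sketch. It is not Jena's actual code: the class and method names are made up and the character tests are a simplified, ASCII-only approximation of the XML NCName rules. It scans backwards from the end of the URI over name characters, then forwards to the first character that may start a name.

```java
class LocalNameSketch {
    // Simplified assumption: only ASCII letters and '_' may start an NCName.
    static boolean isNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
    }

    // Simplified assumption: start characters plus digits, '-' and '.' may continue it.
    static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    /** Longest suffix of the URI that is a legal (ASCII-approximated) NCName. */
    static String xmlLocalName(String uri) {
        int i = uri.length();
        // Walk backwards while the characters are allowed inside an NCName ('%' is not).
        while (i > 0 && isNameChar(uri.charAt(i - 1))) i--;
        // Walk forwards past characters that cannot start an NCName (digits, '.', '-').
        while (i < uri.length() && !isNameStart(uri.charAt(i))) i++;
        return uri.substring(i);
    }

    public static void main(String[] args) {
        String uri = "http://example.com/location.owl#room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7";
        System.out.println(xmlLocalName(uri)); // prints "_07.OG_NIC7"
    }
}
```

The backward scan stops at the last '%', leaving "203_07.OG_NIC7". The digits "203" cannot start an XML name, so the forward scan skips them and the result begins at the underscore.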

> 2/ Other question :
>
> Is there a magic method in the Jena API that, given a String,
> transforms it into a proper localName? (sanitize it)

Yes - you called it :-) I'm afraid the rules of local names aren't 
always what you want as they cater for RDF/XML.

RDF 1.1 is going to change this for the better.

1/ The local part of a Turtle prefix name will allow %20
2/ Turtle prefix names will be able to start with a number (XML qnames 
can't).
3/ Other interesting chars can be escaped into a local name. e.g. ":" as \:

This is not implemented in Jena yet - the details have not been 
finalized by the RDF working group.

> I thought that URIRef.encode was doing this job, but it doesn't
> correct some illegal local names, such as the ones beginning with
> numbers (an entity with URI "http://example.com/location.owl#123" is
> illegal according to the specs
> (http://www.w3.org/TR/xml-names/#NT-NCName))

Yes - XML rules are pretty restrictive.
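
If a name that is always a legal XML local name is really needed (for example for a property, which RDF/XML must write as a qname), a hand-written sanitizer is one option. The sketch below is not a Jena method: the class and method names are hypothetical, it only handles a simplified ASCII subset of the NCName rules, and it is lossy, so different inputs can map to the same output.

```java
class SanitizeSketch {
    /** Hypothetical, lossy sanitizer: maps any string to an ASCII-only NCName. */
    static String toNCName(String s) {
        // Replace every character outside the simplified NCName set with '_'.
        String cleaned = s.replaceAll("[^A-Za-z0-9_.-]", "_");
        // An NCName must not be empty or start with a digit, '.' or '-': prefix an underscore.
        if (!cleaned.matches("[A-Za-z_].*")) {
            cleaned = "_" + cleaned;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(toNCName("123"));                                       // prints "_123"
        System.out.println(toNCName("room_Hamburg_Am Hauptbahnhof 3_07.OG_NIC7")); // spaces become '_'
    }
}
```

Because the mapping cannot be reversed, the original string would have to be kept elsewhere, for instance as a label on the resource.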

> I also tried IRIFactory, but it seems to only report errors, not
> correct them by itself.
>
> Thanks a lot in advance for any answer,
>
> Best regards
>
> N. Nobelis
>

Summary: put this into the data:

http://example.com/location.owl#room_Hamburg_Am%20Hauptbahnhof%203_07.OG_NIC7

and don't worry too much about local names.

Hope that helps,

	Andy