Posted to dev@jena.apache.org by "Stian Soiland-Reyes (JIRA)" <ji...@apache.org> on 2015/01/09 05:31:35 UTC

[jira] [Commented] (JENA-846) Add IRI.toUri

    [ https://issues.apache.org/jira/browse/JENA-846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270540#comment-14270540 ] 

Stian Soiland-Reyes commented on JENA-846:
------------------------------------------

It looks good. I would also consider going via the ASCII string (even though java.net.URI kind-of supports IRIs anyway), rather than constructing from the individual components.

As {{iri.toUri()}} would just be a convenience shorthand for  {{URI.create(iri.createASCIIString());}} (or new URI(..)) - would you not expect a RuntimeException instead of URISyntaxException? Adding a try..catch for a convenience shorthand would be a bit odd.. :)

In what circumstances do we expect {{iri.toUri()}} to fail? The choice between the checked exception of the java.net.URI constructor and the unchecked one of {{URI.create()}} is based on how trustworthy the source string is, so say:

{code}
    u = URI.create("http://hardcoded.example.com/");  // We assume the developer-provided URI is correct
{code}

vs.

{code}
try {
  u = new URI(webForm.get("uri"));  // user-provided string, so it may well be malformed
} catch (URISyntaxException ex) {
  System.err.println("But it's not a URI!");
}
{code}

in a way - do you expect an exception? :)
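
To make that distinction concrete, here is a minimal sketch using only java.net.URI. The class and method names ({{UriExceptionSketch}}, {{toUriUnchecked}}, {{toUriChecked}}) are made up for illustration and are not part of Jena or the attached patch; the string argument stands in for whatever ASCII form the IRI produces.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration only: these helpers are not Jena API.
class UriExceptionSketch {

    // Unchecked style: URI.create() wraps any URISyntaxException
    // in an IllegalArgumentException, so callers need no try..catch.
    public static URI toUriUnchecked(String asciiIri) {
        return URI.create(asciiIri);
    }

    // Checked style: the constructor forces callers to handle URISyntaxException.
    public static URI toUriChecked(String asciiIri) throws URISyntaxException {
        return new URI(asciiIri);
    }

    public static void main(String[] args) {
        // Trusted, developer-provided string: the unchecked style reads naturally.
        System.out.println(toUriUnchecked("http://hardcoded.example.com/"));

        // Untrusted string (a space is not allowed in a java.net.URI path).
        String untrusted = "http://example.com/not a uri";
        try {
            toUriChecked(untrusted);
        } catch (URISyntaxException ex) {
            System.err.println("Checked: " + ex.getMessage());
        }
        try {
            toUriUnchecked(untrusted);
        } catch (IllegalArgumentException ex) {
            // The original URISyntaxException is available as the cause.
            System.err.println("Unchecked, cause: " + ex.getCause().getClass().getName());
        }
    }
}
```

With the unchecked style the caller only writes a try..catch if a failure is actually expected, which is the argument for it in a convenience shorthand.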

In Jena's IRI those user-entry errors shouldn't be an issue as it has already passed through the IRIFactory (I'm assuming there's not a "Let anything through" option on the factory). And so I would similarly expect to only get an {{IllegalStateException}} in the odd case of something that {{java.net.URI}} can't deal with or handles in a buggy way.

Here, the only exception is the MalformedIDNException, right - so it's a bit somewhere between an IOException and a syntactical error. Would it really happen here in this direction? I tried to look up the test cases..

A kind of fallback with java.net.URI would be to construct it with a Unicode host (by components) only if a MalformedIDNException is thrown - after all, such 'broken' hostnames would never cause any problem with java.net.URI.create("http://søiland.no/") - it's just that it's ASCII-escaped to http://s%C3%B8iland.no/ instead of punycoded to http://xn--siland-bya.no/ (just don't try to call uri.toURL().openConnection() on that..). But then again, java.net.URI is not meant for network resolution, just for comparisons and relativization.
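
The difference between the two ASCII forms can be checked with plain JDK classes. This is only a sketch: the class and method names ({{IdnVersusPercentEscape}}, {{percentEscaped}}, {{punycodedHost}}) are invented for illustration, and it uses java.net.IDN for the punycode step rather than anything from Jena.

```java
import java.net.IDN;
import java.net.URI;

// Hypothetical illustration only: not Jena API.
class IdnVersusPercentEscape {

    // java.net.URI accepts the Unicode host and percent-escapes it as UTF-8.
    public static String percentEscaped(String iri) {
        return URI.create(iri).toASCIIString();
    }

    // java.net.IDN converts a Unicode hostname to its punycode (xn--) form.
    public static String punycodedHost(String host) {
        return IDN.toASCII(host);
    }

    public static void main(String[] args) {
        // "\u00f8" is the letter ø, written as an escape to avoid source-encoding issues.
        String host = "s\u00f8iland.no";
        System.out.println(percentEscaped("http://" + host + "/")); // http://s%C3%B8iland.no/
        System.out.println(punycodedHost(host));                    // xn--siland-bya.no
    }
}
```

Only the punycoded form is a hostname that DNS could resolve; the percent-escaped form is still fine for comparison and relativization.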


I would also expect the counterpart 'fromUri' or .create(URI) in the {{IRIFactoryI}} -- I know you could say it can just be done via the string, but that's the case both ways :). In this case Jena can be good and handle such 'unicody' java.net.URIs - this is also where it might want to fall over with a MalformedIDNException?


BTW.. you might like (or not!)  to see there's been a discussion flaring up this week about how RFC3986 might not be quite right about IDNAs (or at least not quite what people implement):  http://www.ietf.org/mail-archive/web/apps-discuss/current/maillist.html 

> Add IRI.toUri 
> --------------
>
>                 Key: JENA-846
>                 URL: https://issues.apache.org/jira/browse/JENA-846
>             Project: Apache Jena
>          Issue Type: New Feature
>          Components: IRI
>    Affects Versions: Jena 2.12.1
>            Reporter: Andy Seaborne
>            Priority: Minor
>         Attachments: iri-uri.patch
>
>
> See discussion from:
> http://mail-archives.apache.org/mod_mbox/jena-users/201501.mbox/%3C54AE8658.3040205%40apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)