Posted to dev@oozie.apache.org by "Ben Roling (JIRA)" <ji...@apache.org> on 2014/05/10 23:57:57 UTC

[jira] [Commented] (OOZIE-1829) URIHandlerService doesn't support URI schemes with query strings but no path segment

    [ https://issues.apache.org/jira/browse/OOZIE-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993827#comment-13993827 ] 

Ben Roling commented on OOZIE-1829:
-----------------------------------

It appears the service also doesn't fully support URI schemes where there is no authority.  For example, you might have a Kite URI like this:

repo:hive?dataset-name=Person&partition-key=[201405091300]

When getAuthorityWithScheme() is called, it would return "/" because it fails to find an authority in the URI.  It looks to me like this will cause Oozie to fall back on the default URIHandler.  The getAuthorityWithScheme() method currently seems to be used only to determine PUSH vs PULL in CoordCommandUtils.materializeDataEvents(), so I would expect the impact to be just that the PULL model would always be used, even though the Kite URIHandler might wish to specify a PUSH model.
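
To illustrate why no authority is found, here is a minimal sketch using plain java.net.URI. It does not call Oozie's URIHandlerService; the class name, the authorityOf() helper, and the hdfs URI used for comparison are hypothetical. With no "//" after the scheme, java.net.URI treats the Kite URI as opaque and reports a null authority:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class OpaqueUriSketch {
    // Hypothetical helper: returns the authority java.net.URI reports,
    // or null when the URI has none.
    static String authorityOf(String uri) throws URISyntaxException {
        return new URI(uri).getAuthority();
    }

    public static void main(String[] args) throws URISyntaxException {
        // The authority-less Kite URI from the comment above.
        URI kite = new URI("repo:hive?dataset-name=Person&partition-key=[201405091300]");
        System.out.println("opaque:    " + kite.isOpaque());
        System.out.println("scheme:    " + kite.getScheme());
        System.out.println("authority: " + kite.getAuthority());
        System.out.println("ssp:       " + kite.getSchemeSpecificPart());

        // A made-up hierarchical URI for comparison; this one has an authority.
        System.out.println("hdfs authority: " + authorityOf("hdfs://namenode:8020/data/person"));
    }
}
```

Any code that keys its handler lookup on the authority therefore has nothing to work with for a URI of this shape.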

> URIHandlerService doesn't support URI schemes with query strings but no path segment
> ------------------------------------------------------------------------------------
>
>                 Key: OOZIE-1829
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1829
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.0.1
>            Reporter: Ben Roling
>
> While working on a prototype of integration between Oozie and the Kite SDK (see https://issues.cloudera.org/browse/CDK-385), I found that URIHandlerService.getAuthorityWithScheme(String uri) doesn't support URI schemes where there is a query string, but no path segment.
> I am currently prototyping Kite Dataset URIs and in my prototype, a Dataset URI for a dataset in a Hive/HCatalog DatasetRepository with managed Hive/HCatalog tables could look like this:
> repo:hive://localhost:9043?dataset-name=Person&partition-key=[201405091300]
> I am attempting to create an Oozie dataset around this Kite dataset and to make that happen I have implemented Oozie's URIHandler API for the Kite "repo" URI scheme.  When I attempted to run my first coordinator, it failed.
> The coordinator has the following dataset definition:
> {code}
>     <dataset name="Person" frequency="${coord:minutes(5)}" initial-instance="2014-04-24T00:00Z" timezone="UTC">
>       <uri-template>repo:hive://localhost:9083?dataset-name=Person&amp;partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]
>       </uri-template>
>     </dataset>
> {code}
> This dataset is used as an output of the coordinator.
> When the coordinator is submitted it fails with the following exception:
> {code}
> 2014-05-09 10:57:34,991 ERROR org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand: SERVER[localhost.localdomain] USER[cloudera] GROUP[-] TOKEN[] APP[Person-c] JOB[0000013-140508121805317-oozie-oozi-C] ACTION[-] Exception occurred:E0906: URI parsing error : repo:hive://localhost:9083?dataset-name=Person&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}] Making the job failed 
> org.apache.oozie.dependency.URIHandlerException: E0906: URI parsing error : repo:hive://localhost:9083?dataset-name=Person&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]
> 	at org.apache.oozie.service.URIHandlerService.getAuthorityWithScheme(URIHandlerService.java:216)
> 	at org.apache.oozie.command.coord.CoordCommandUtils.materializeDataEvents(CoordCommandUtils.java:582)
> 	at org.apache.oozie.command.coord.CoordCommandUtils.materializeOneInstance(CoordCommandUtils.java:451)
> 	at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materializeActions(CoordMaterializeTransitionXCommand.java:386)
> 	at org.apache.oozie.command.coord.CoordMaterializeTransitionXCommand.materialize(CoordMaterializeTransitionXCommand.java:267)
> 	at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:72)
> 	at org.apache.oozie.command.MaterializeTransitionXCommand.execute(MaterializeTransitionXCommand.java:28)
> 	at org.apache.oozie.command.XCommand.call(XCommand.java:280)
> 	at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:744)
> Caused by: java.net.URISyntaxException: Illegal character in opaque part at index 63: repo:hive://localhost:9083?dataset-name=Person&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]
> 	at java.net.URI$Parser.fail(URI.java:2829)
> 	at java.net.URI$Parser.checkChars(URI.java:3002)
> 	at java.net.URI$Parser.parse(URI.java:3039)
> 	at java.net.URI.<init>(URI.java:595)
> 	at org.apache.oozie.service.URIHandlerService.getAuthorityWithScheme(URIHandlerService.java:209)
> 	... 11 more
> {code}
> The problem is that URIHandlerService.getAuthorityWithScheme(String uri) doesn't consider the possibility that the URI might have a query string and no path segment.  As a result, it ends up trying to create a URI from the entire URI template, which fails on the unresolved ${...} template parameters (the "{" at index 63 is not a legal URI character).
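
The failure can be reproduced outside Oozie with plain java.net.URI. The sketch below is only an illustration under stated assumptions: the class name and the stripQuery() helper are hypothetical and are not Oozie's code or a proposed patch. It shows that parsing the full template throws, while parsing only the part before the "?" does not:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class TemplateParseSketch {
    // The uri-template from the dataset definition, with unresolved parameters.
    static final String TEMPLATE =
        "repo:hive://localhost:9083?dataset-name=Person"
        + "&partition-key=[${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}]";

    // Hypothetical helper: drop everything from the first '?' onward.
    static String stripQuery(String uri) {
        int q = uri.indexOf('?');
        return q < 0 ? uri : uri.substring(0, q);
    }

    public static void main(String[] args) throws URISyntaxException {
        try {
            new URI(TEMPLATE);
        } catch (URISyntaxException e) {
            // Reports the reason and the index of the offending character.
            System.out.println(e.getReason() + " at index " + e.getIndex()
                + " ('" + TEMPLATE.charAt(e.getIndex()) + "')");
        }

        // Without the query string the remainder parses, though it is still
        // an opaque URI because "repo:" is not followed by "//".
        URI prefix = new URI(stripQuery(TEMPLATE));
        System.out.println(prefix.getScheme() + " -> " + prefix.getSchemeSpecificPart());
    }
}
```

Note that stripping the query string only avoids the exception; the resulting URI is still opaque, so extracting "localhost:9083" would additionally require handling the nested "hive://" part.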


