Posted to dev@uima.apache.org by "Adam Lally (JIRA)" <ui...@incubator.apache.org> on 2008/06/25 16:08:45 UTC

[jira] Commented: (UIMA-1080) [Patch] Wrong usage of URL in XmiWriterCasConsumer

    [ https://issues.apache.org/jira/browse/UIMA-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608048#action_12608048 ] 

Adam Lally commented on UIMA-1080:
----------------------------------

This patch doesn't seem to handle spaces in the file path.  For example, if you run the Document Analyzer with this input directory:
C:\Program Files\apache-uima\examples\data

Then the output files are produced with the generic names doc0, doc1, etc., indicating that the filename wasn't extracted from the URI.  As I recall, the URI class is much less lenient than the URL class when it comes to spaces.
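
The difference in leniency can be reproduced with a small sketch. It assumes the uri field holds a file URL with an unescaped space; the class name, helper names and the file name doc.txt are hypothetical and only serve the illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Illustration only: compares how java.net.URL and java.net.URI treat
// a string that contains an unescaped space.
class UriSpaceDemo {

    // Returns true if java.net.URI accepts the string as-is.
    static boolean parsesAsUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Returns true if java.net.URL accepts the string as-is.
    // (The URL(String) constructor is deprecated in recent JDKs but still available.)
    static boolean parsesAsUrl(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical value of the uri field for a file under the input dir above.
        String withSpace = "file:/C:/Program Files/apache-uima/examples/data/doc.txt";
        System.out.println("URL accepts: " + parsesAsUrl(withSpace));
        System.out.println("URI accepts: " + parsesAsUri(withSpace));
    }
}
```

If the URI constructor rejects the string, code that falls back to a default on failure would produce exactly the generic doc0, doc1 names described above.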

This might be considered a problem with the FileSystemCollectionReader, which populates the SourceDocumentInformation.uri field.  Perhaps it should not be putting spaces in there.  However, I am somewhat nervous about changing this to URL-encode the uri, since I think it is likely there's some user code out there that is relying on the current behavior.
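
For reference, here is a rough sketch of what an encoded uri value would look like and how a consumer could read the file name back. It is not the reader's actual code: it assumes the encoding would come from java.io.File.toURI(), and the class name, helper names and sample path are hypothetical.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Illustration only: producing a percent-encoded file URI and
// recovering the decoded file name from it.
class EncodedUriDemo {

    // Percent-encoded file URI for a local file (a space becomes %20).
    static String toEncodedUri(File f) {
        return f.toURI().toString();
    }

    // Decoded last path segment of a URI string.
    // Returns null if the string is not a valid URI or has no path.
    static String fileName(String uri) {
        try {
            String path = new URI(uri).getPath(); // getPath() decodes %20
            if (path == null) {
                return null;
            }
            return path.substring(path.lastIndexOf('/') + 1);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample path containing a space.
        String uri = toEncodedUri(new File("/tmp/Program Files/data/doc.txt"));
        System.out.println(uri);
        System.out.println(fileName(uri));
    }
}
```

User code that currently treats the uri value as a raw path would see %20 instead of a space after such a change, which is the compatibility risk mentioned above.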

Also, whatever change is applied to XmiWriterCasConsumer probably should also be applied to XCasWriterCasConsumer.  And there are also example versions of these classes in the uimaj-examples project.

> [Patch] Wrong usage of URL in XmiWriterCasConsumer
> --------------------------------------------------
>
>                 Key: UIMA-1080
>                 URL: https://issues.apache.org/jira/browse/UIMA-1080
>             Project: UIMA
>          Issue Type: Improvement
>          Components: InternalTools
>    Affects Versions: 2.2.2
>            Reporter: Richard Eckart
>            Priority: Minor
>         Attachments: UIMA-1080.patch
>
>
> The XmiWriterCasConsumer wraps the value of SourceDocumentInformation.getUri() in a URL to extract the path. This only works if the value returned by getUri() is actually a URL starting with http, ftp or some other known protocol. It does not work if a framework user puts some self-defined URIs in there, such as annolab://default/myfile.
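
The behavior described in the issue can be sketched as follows. This assumes a JVM with no protocol handler registered for the annolab scheme; the class and helper names are hypothetical and this is not the consumer's actual code.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Illustration only: extracting the path from a self-defined URI
// with java.net.URL versus java.net.URI.
class CustomSchemeDemo {

    // Path as seen by java.net.URL, or null if the string is rejected
    // (for example because the protocol has no registered handler).
    static String pathViaUrl(String s) {
        try {
            return new URL(s).getPath();
        } catch (MalformedURLException e) {
            return null;
        }
    }

    // Path as seen by java.net.URI, or null if the string is not a valid URI.
    static String pathViaUri(String s) {
        try {
            return new URI(s).getPath();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String custom = "annolab://default/myfile";
        System.out.println("via URL: " + pathViaUrl(custom));
        System.out.println("via URI: " + pathViaUri(custom));
    }
}
```

URI parsing is purely syntactic, so it does not need to know the scheme, whereas URL requires a handler for the protocol.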

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.