Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2010/10/13 17:32:36 UTC

[jira] Commented: (CONNECTORS-118) Crawled archive files should be expanded into their constituent files

    [ https://issues.apache.org/jira/browse/CONNECTORS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920609#action_12920609 ] 

Karl Wright commented on CONNECTORS-118:
----------------------------------------

The key question here is how you identify a component of an archive.  Each component needs a URL of its own, or the search results that point to it are not going to mean anything.

Since assembling URLs is the connector's job, this is likely to be connector-specific.  Also, most connectors will never deal with archives.  Can you provide a list of the connectors where you believe this is important, and show what the URLs for the subpieces of an archive would look like?
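
To make the question concrete, here is a minimal sketch of one possible answer for zip files.  It assumes a "<archive URL>!/<entry path>" form borrowed from Java's jar: URL syntax; that convention, the function names, and the example.com URL are all illustrative assumptions, not anything ManifoldCF or a particular connector defines.

```python
import io
import zipfile
from urllib.parse import quote


def component_urls(archive_url, archive_bytes):
    """Return one URL per file entry in a zip archive.

    The "<archive URL>!/<entry path>" form is an assumption borrowed from
    Java's jar: URL syntax; a real connector would pick its own scheme.
    """
    urls = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no content to index
            # Percent-encode the entry path, keeping "/" as a separator.
            urls.append(archive_url + "!/" + quote(info.filename))
    return urls


def make_sample_zip():
    """Build a small in-memory zip so the sketch runs without any files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("docs/readme.txt", "hello")
        zf.writestr("my notes.txt", "world")
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical archive location, used only for illustration.
    for url in component_urls("http://example.com/files/archive.zip",
                              make_sample_zip()):
        print(url)
```

Whether a URL of this shape can actually be resolved by a browser or by the repository is exactly the connector-specific part, so treat the separator as a placeholder for whatever each connector can support.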


> Crawled archive files should be expanded into their constituent files
> ---------------------------------------------------------------------
>
>                 Key: CONNECTORS-118
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-118
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Framework crawler agent
>            Reporter: Jack Krupansky
>
> Archive files such as zip, mbox, tar, etc. should be expanded into their constituent files during crawling of repositories so that any output connector would output the flattened archive.
> This could be an option, defaulted to ON, since someone may want to implement a "copy" connector that maintains crawled files as-is.
