Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2014/06/27 15:26:24 UTC

[jira] [Updated] (CONNECTORS-917) SharePoint connector would benefit from site discovery

     [ https://issues.apache.org/jira/browse/CONNECTORS-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-917:
-----------------------------------

    Fix Version/s:     (was: ManifoldCF 1.7)
                   ManifoldCF 2.0

Punting to 2.0

> SharePoint connector would benefit from site discovery
> ------------------------------------------------------
>
>                 Key: CONNECTORS-917
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-917
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: SharePoint connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.0
>
>
> The current SharePoint connector can only crawl a single SharePoint site.  But a SharePoint installation can host multiple sites, and in some cases there are hundreds of them.  Setting up a connection and jobs for each one would be a difficult task.
> The SharePoint admin site allows you to discover the sites that exist.  Using this feature as part of the crawl would allow for a much more automated way of handling large SharePoint installations.
> Some notes:
>    - Not yet clear how "one site" vs. "many sites" should coexist in one connector
>      - Form of document identifier must change
>      - Each document identifier must include the site path first
>      - Since the subsite path can be just "/", the identifier format also needs to be resilient against that
>      - Something like: <site_path>//<current_subsite_doc_list_item_etc_path>.  But "//" will collide with old-style identifiers.
>      - If old-style document identifier always must start with a "/", then we can simply start it with (say) a "+", to signal that it is a new-style identifier
>      - Not clear yet if there's a new form that would allow us to know if a doc identifier was old form or not
>    - The native authority also currently needs to know which site it is working with
>      - Site discovery therefore must also be run in the authority, and tokens for each discovered site must be returned
>      - Native tokens must therefore be qualified with a site ID
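
The identifier notes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the connector's actual code: the class and method names are hypothetical, old-style identifiers are assumed to always start with "/", a site path is assumed never to contain "//", and the root site "/" is stored as an empty string so that the first "//" after the "+" marker is always the separator.

```java
// Sketch only: all names here are hypothetical, not part of the ManifoldCF API.
final class SiteQualifiedId {
  // Marker suggested in the notes; old-style identifiers are assumed to start with "/".
  static final char NEW_STYLE_MARKER = '+';
  static final String SEPARATOR = "//";

  private SiteQualifiedId() {}

  // True if the identifier carries the new-style marker.
  static boolean isNewStyle(String id) {
    return !id.isEmpty() && id.charAt(0) == NEW_STYLE_MARKER;
  }

  // Builds "+<site_path>//<path_within_site>".
  // Trailing slashes are stripped from the site path, so the root site "/" becomes "".
  static String encode(String sitePath, String pathWithinSite) {
    String site = sitePath;
    while (site.endsWith("/")) {
      site = site.substring(0, site.length() - 1);
    }
    if (site.contains(SEPARATOR)) {
      throw new IllegalArgumentException("site path must not contain \"//\": " + sitePath);
    }
    return NEW_STYLE_MARKER + site + SEPARATOR + pathWithinSite;
  }

  // Returns { sitePath, pathWithinSite } for a new-style identifier.
  static String[] decode(String id) {
    if (!isNewStyle(id)) {
      throw new IllegalArgumentException("not a new-style identifier: " + id);
    }
    // The normalized site path has no "//" and no trailing "/",
    // so the first "//" after the marker is the separator.
    int sep = id.indexOf(SEPARATOR, 1);
    if (sep < 0) {
      throw new IllegalArgumentException("missing separator: " + id);
    }
    String site = id.substring(1, sep);
    String rest = id.substring(sep + SEPARATOR.length());
    return new String[] { site.isEmpty() ? "/" : site, rest };
  }
}
```

With this layout, a root site with a list path and a named site whose subsite path is just "/" both decode back to their original parts, while an identifier starting with "/" is still recognized as old-style. Whether the real connector should normalize the root site this way is an open design question, as the notes themselves say.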



--
This message was sent by Atlassian JIRA
(v6.2#6252)