Posted to dev@manifoldcf.apache.org by "Steph van Schalkwyk (JIRA)" <ji...@apache.org> on 2018/11/02 00:57:00 UTC

[jira] [Commented] (CONNECTORS-1529) Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)

    [ https://issues.apache.org/jira/browse/CONNECTORS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672423#comment-16672423 ] 

Steph van Schalkwyk commented on CONNECTORS-1529:
-------------------------------------------------

I have added the "documentId" metatag to the Web Connector:
 * "documentId": "http://localhost:8000/000010.pdf"

*Will this work for everybody?*

*Steph*

 * "_index": "index_cpt_all",
 * "_type": "catalogline",
 * "_id": "http://localhost:8000/000010.pdf",
 * "_version": 1,
 * "_score": 1,
 * "_source": {
 ** "date": "2005-05-05T21:19:55Z",
 ** "pdf:PDFVersion": "1.3",
 ** "pdf:docinfo:title": "Microsoft Word - 48428.doc",
 ** "xmp:CreatorTool": "PScript5.dll Version 5.2",
 ** "Server": "SimpleHTTP/0.6 Python/3.5.2",
 ** "access_permission:modify_annotations": "true",
 ** "access_permission:can_print_degraded": "true",
 ** "dc:creator": "edocslib",
 ** "dcterms:created": "2005-05-05T21:19:55Z",
 ** "Last-Modified": "2005-05-05T21:19:55Z",
 ** "dcterms:modified": "2005-05-05T21:19:55Z",
 ** "dc:format": "application/pdf; version=1.3",
 ** "title": "Microsoft Word - 48428.doc",
 ** "Last-Save-Date": "2005-05-05T21:19:55Z",
 ** "pdf:docinfo:creator_tool": "PScript5.dll Version 5.2",
 ** "access_permission:fill_in_form": "true",
 ** "pdf:docinfo:modified": "2005-05-05T21:19:55Z",
 ** "stream_name": "000010.pdf",
 ** "meta:save-date": "2005-05-05T21:19:55Z",
 ** "pdf:encrypted": "false",
 ** "dc:title": "Microsoft Word - 48428.doc",
 ** "modified": "2005-05-05T21:19:55Z",
 ** "Content-Length": "120441",
 ** "Content-Type": "application/pdf",
 ** "stream_size": "120441",
 ** "pdf:docinfo:creator": "edocslib",
 ** "X-Parsed-By": "org.apache.tika.parser.DefaultParser",
 ** "creator": "edocslib",
 ** "meta:author": "edocslib",
 ** "meta:creation-date": "2005-05-05T21:19:55Z",
 ** "created": "Thu May 05 16:19:55 CDT 2005",
 ** "documentId": "http://localhost:8000/000010.pdf",
 ** "access_permission:extract_for_accessibility": "true",
 ** "access_permission:assemble_document": "true",
 ** "xmpTPg:NPages": "4",
 ** "Creation-Date": "2005-05-05T21:19:55Z",
 ** "resourceName": "000010.pdf",
 ** "access_permission:extract_content": "true",
 ** "access_permission:can_print": "true",
 ** "Content-type": "application/pdf",
 ** "Author": "edocslib",
 ** "producer": "Acrobat Distiller 5.0 (Windows)",
 ** "access_permission:can_modify": "true",
 ** "pdf:docinfo:producer": "Acrobat Distiller 5.0 (Windows)",
 ** "pdf:docinfo:created": "2005-05-05T21:19:55Z",
 ** "indexed": "2018-11-02T00:50:48.053+0000",
 ** "mime-type": "application/pdf",
 ** "allow_token_document": "__nosecurity__",
 ** "deny_token_document": "__nosecurity__",
 ** "allow_token_share": "__nosecurity__",
 ** "deny_token_share": "__nosecurity__",
 ** "allow_token_parent": "__nosecurity__",
 ** "deny_token_parent": "__nosecurity__",
 ** "content": " Federal Communications Commission DA 05

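To illustrate the approach in the comment above: the record carries the same URL in both "_id" and the "documentId" field of "_source", so the value stays available as an ordinary field even though "_id" itself cannot be copied in the schema. A minimal Python sketch of that duplication follows. It is not the connector's code (ManifoldCF is written in Java); the function name build_index_source and its id_field parameter are hypothetical, chosen only for illustration.

```python
import json


def build_index_source(document_uri, metadata, id_field="documentId"):
    """Return a copy of `metadata` with the document URI stored under `id_field`.

    Hypothetical helper: it only shows the idea of duplicating the value
    used as `_id` into a regular field of the document source.
    """
    source = dict(metadata)          # copy, so the caller's dict is not modified
    source[id_field] = document_uri  # same value that is used as the ES `_id`
    return source


doc_uri = "http://localhost:8000/000010.pdf"
metadata = {"stream_name": "000010.pdf", "mime-type": "application/pdf"}

source = build_index_source(doc_uri, metadata)
print(json.dumps({"_id": doc_uri, "_source": source}, indent=2))
```

Passing id_field="url" would produce the field name requested in the issue title instead of "documentId".
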
> Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1529
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1529
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Elastic Search connector
>    Affects Versions: ManifoldCF 2.10
>            Reporter: Steph van Schalkwyk
>            Assignee: Steph van Schalkwyk
>            Priority: Major
>             Fix For: ManifoldCF 2.12
>
>         Attachments: elasticsearch.patch, image-2018-09-06-10-28-45-008.png
>
>
> Add "url" (copy of the _id field) to ES Output.
> ES no longer supports copying from _id (copy_to) in the schema.
> As per the attached image (image-2018-09-06-10-28-45-008.png).
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)