Posted to dev@manifoldcf.apache.org by "Jens Jahnke (JIRA)" <ji...@apache.org> on 2014/10/23 15:52:34 UTC

[jira] [Created] (CONNECTORS-1081) Documentation: elasticsearch index creation.

Jens Jahnke created CONNECTORS-1081:
---------------------------------------

             Summary: Documentation: elasticsearch index creation.
                 Key: CONNECTORS-1081
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1081
             Project: ManifoldCF
          Issue Type: Improvement
          Components: Documentation
            Reporter: Jens Jahnke
            Priority: Minor


Hi,

This may be useful for the documentation.

Here are some simple steps for creating an Elasticsearch index:
{code}
% curl -XPUT 'http://localhost:9200/manifoldcf'
% curl -XPUT 'http://localhost:9200/manifoldcf/attachment/_mapping' -d '
{
  "attachment" : {
    "_source" : {
      "excludes" : [ "file" ]
    },
    "properties": { 
      "allow_token_document" : { 
        "type" : "string" 
      },
      "allow_token_parent" : { 
        "type" : "string" 
      },
      "allow_token_share" : { 
        "type" : "string" 
      },
      "attributes" : {
        "type" : "string"
      },
      "createdOn" : {
        "type" : "string"
      },
      "deny_token_document" : {
        "type" : "string"
      },
      "deny_token_parent" : {
        "type" : "string"
      },
      "deny_token_share" : {
        "type" : "string"
      },
      "lastModified" : {
        "type" : "string"
      },
      "shareName" : {
        "type" : "string"
      },
      "file" : {
        "type" : "attachment",
        "path" : "full",
        "fields" : {
          "file" : {
            "store" : true,
            "term_vector" : "with_positions_offsets",
            "type" : "string"
          }
        }
      }
    }
  }
}'
{code}

This creates an index called {{manifoldcf}} with a mapping named {{attachment}}. The mapping has some generic fields for access tokens and a field {{file}} that uses the Elasticsearch attachment mapper plugin, so the plugin must be installed before the mapping is created. The {{file}} field is configured for highlighting ({{"term_vector" : "with_positions_offsets"}}).
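
To try the highlighting configuration, a search along the following lines might help. This is only a sketch: the {{ES_URL}} variable and the function names {{highlight_query}} and {{search_highlight}} are made up for illustration, and querying the field as {{file}} assumes the attachment mapper plugin behaves as it does with the mapping above.

```shell
# Assumption: Elasticsearch listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a search body that matches a term in the "file" field and asks for
# highlighted fragments from the same field.
# The term is inserted verbatim, so it must not contain double quotes or backslashes.
highlight_query() {
  printf '{"query":{"match":{"file":"%s"}},"highlight":{"fields":{"file":{}}}}' "$1"
}

# Send the query to the manifoldcf index, restricted to the attachment type.
search_highlight() {
  curl -sS -XPOST "$ES_URL/manifoldcf/attachment/_search?pretty" -d "$(highlight_query "$1")"
}
```

For example, {{search_highlight invoice}} would search the indexed file contents for "invoice"; if highlighting works, each hit should carry a {{highlight}} section with fragments from {{file}}.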

The following part excludes the {{file}} field from the stored source JSON, which reduces the index size significantly. Be aware that you shouldn't do this if you need to re-index data on the Elasticsearch side or you want access to the whole document.

{code}
"_source" : {
  "excludes" : [ "file" ]
},
{code}
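
One way to check the effect of the exclusion is to fetch the stored source of an indexed document and confirm that {{file}} is missing from it. A minimal sketch, assuming a document has already been indexed; {{ES_URL}}, {{source_url}}, {{show_source}} and the document ID are placeholders, not names used by ManifoldCF:

```shell
# Assumption: Elasticsearch listens on localhost:9200 unless ES_URL is set.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the URL that returns only the stored _source of one document.
# The document ID is passed as the first argument and must already be URL-encoded.
source_url() {
  printf '%s/manifoldcf/attachment/%s/_source?pretty' "$ES_URL" "$1"
}

# Fetch the stored source; with the exclusion in place it should not contain "file".
show_source() {
  curl -sS -XGET "$(source_url "$1")"
}
```

Calling {{show_source}} with the ID of an indexed document should then print the access token and metadata fields, but not the attachment content.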



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)