You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@metamodel.apache.org by kaspersorensen <gi...@git.apache.org> on 2016/04/26 17:24:28 UTC

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

GitHub user kaspersorensen opened a pull request:

    https://github.com/apache/metamodel/pull/98

    ElasticSearch bulk support and mapping creation using Map

    This branch adds support for two things... 
    
     * Creation of elastic search mappings (document types) using a JSON object instead of array. This was discussed on the mailing list in thread "Approach to submitting ElasticSearch mapping".
     * Support for bulk operations in the REST elasticsearch connector
    
    Sorry, it should have been two PRs, but I did it all at once - please let me know if you need it to be split up.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kaspersorensen/metamodel feature/elasticsearch-bulk-and-mapping

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metamodel/pull/98.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #98
    
----
commit 1662b303650fd69b26cc8f06bf5e8b6b241afde6
Author: kaspersorensen <i....@gmail.com>
Date:   2016-04-25T23:16:29Z

    Added support for bulk operations in REST ElasticSearch module

commit 148b236899611418516574f7ac54789a17a0fbf1
Author: kaspersorensen <i....@gmail.com>
Date:   2016-04-26T15:21:22Z

    Added bulk error messaging for failed items

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by kaspersorensen <gi...@git.apache.org>.
Github user kaspersorensen commented on a diff in the pull request:

    https://github.com/apache/metamodel/pull/98#discussion_r61196721
  
    --- Diff: elasticsearch/common/src/main/java/org/apache/metamodel/elasticsearch/common/ElasticSearchUtils.java ---
    @@ -95,17 +103,49 @@ public static QueryBuilder getExistsQuery(String fieldName) {
                     // do nothing - the ID is a client-side construct
                     continue;
                 }
    -            sourceProperties.add(columnName);
    +            
    +            final String fieldName = getValidatedFieldName(columnName);
    +            final Map<String, String> propertyMap = new HashMap<>();
    +            final String type = getType(column);
    +            propertyMap.put("type", type);
    +//            propertyMap.put("store", "true");
    --- End diff --
    
    Ah good that you mentioned it. I couldn't find the "store" attribute in newer versions of the ElasticSearch REST documentation [1]. So I think it won't work anymore ... My guess (at least my evidence so far when working with this) is that there isn't an option anymore to not store field values.
    
    [1] https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by LosD <gi...@git.apache.org>.
Github user LosD commented on a diff in the pull request:

    https://github.com/apache/metamodel/pull/98#discussion_r61166439
  
    --- Diff: elasticsearch/common/src/main/java/org/apache/metamodel/elasticsearch/common/ElasticSearchUtils.java ---
    @@ -78,15 +86,15 @@ public static QueryBuilder getExistsQuery(String fieldName) {
             return getFilteredQuery("exists", fieldName);
         }
     
    -    public static List<Object> getSourceProperties(final MutableTable table) {
    +    public static Object getMappingSource(final MutableTable table) {
    --- End diff --
    
    Why not `Map` here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by LosD <gi...@git.apache.org>.
Github user LosD commented on a diff in the pull request:

    https://github.com/apache/metamodel/pull/98#discussion_r61165846
  
    --- Diff: elasticsearch/common/src/main/java/org/apache/metamodel/elasticsearch/common/ElasticSearchUtils.java ---
    @@ -95,17 +103,49 @@ public static QueryBuilder getExistsQuery(String fieldName) {
                     // do nothing - the ID is a client-side construct
                     continue;
                 }
    -            sourceProperties.add(columnName);
    +            
    +            final String fieldName = getValidatedFieldName(columnName);
    +            final Map<String, String> propertyMap = new HashMap<>();
    +            final String type = getType(column);
    +            propertyMap.put("type", type);
    +//            propertyMap.put("store", "true");
    --- End diff --
    
    How come this is commented out? Don't we want to ensure storage?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by LosD <gi...@git.apache.org>.
Github user LosD commented on a diff in the pull request:

    https://github.com/apache/metamodel/pull/98#discussion_r61165997
  
    --- Diff: elasticsearch/rest/pom.xml ---
    @@ -27,7 +27,7 @@ under the License.
     	<modelVersion>4.0.0</modelVersion>
     
     	<properties>
    -		<jest.version>0.1.7</jest.version>
    +		<jest.version>2.0.2</jest.version>
    --- End diff --
    
    Oh, nice bump :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by kaspersorensen <gi...@git.apache.org>.
Github user kaspersorensen commented on the pull request:

    https://github.com/apache/metamodel/pull/98#issuecomment-218943841
  
    I've updated this PR. Maybe you can check it again @LosD ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by LosD <gi...@git.apache.org>.
Github user LosD commented on the pull request:

    https://github.com/apache/metamodel/pull/98#issuecomment-218965572
  
    LGTM! :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by kaspersorensen <gi...@git.apache.org>.
Github user kaspersorensen commented on a diff in the pull request:

    https://github.com/apache/metamodel/pull/98#discussion_r61195959
  
    --- Diff: elasticsearch/common/src/main/java/org/apache/metamodel/elasticsearch/common/ElasticSearchUtils.java ---
    @@ -78,15 +86,15 @@ public static QueryBuilder getExistsQuery(String fieldName) {
             return getFilteredQuery("exists", fieldName);
         }
     
    -    public static List<Object> getSourceProperties(final MutableTable table) {
    +    public static Object getMappingSource(final MutableTable table) {
    --- End diff --
    
    Hmm I chose for Object because I felt that the data type was irrelevant - or even an implementation detail. The 'source' object is some kind of magical object anyway without much type safety. But on the other hand, it seems that the `source(Map)` method is different in behaviour from `source(Object)` so that's a good case for changing this method to return map.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by kaspersorensen <gi...@git.apache.org>.
Github user kaspersorensen commented on the pull request:

    https://github.com/apache/metamodel/pull/98#issuecomment-214782257
  
    Correction:
    
    It's actually also a third thing that is added, namely stripping of illegal field name characters, see METAMODEL-246


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by LosD <gi...@git.apache.org>.
Github user LosD commented on the pull request:

    https://github.com/apache/metamodel/pull/98#issuecomment-214892259
  
    Code looks good to me, though there was a few surprises.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/metamodel/pull/98


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] metamodel pull request: ElasticSearch bulk support and mapping cre...

Posted by LosD <gi...@git.apache.org>.
Github user LosD commented on a diff in the pull request:

    https://github.com/apache/metamodel/pull/98#discussion_r61166829
  
    --- Diff: elasticsearch/native/src/main/java/org/apache/metamodel/elasticsearch/nativeclient/ElasticSearchCreateTableBuilder.java ---
    @@ -44,15 +44,20 @@ public ElasticSearchCreateTableBuilder(ElasticSearchUpdateCallback updateCallbac
         @Override
         public Table execute() throws MetaModelException {
             final MutableTable table = getTable();
    -        final List<Object> sourceProperties = ElasticSearchUtils.getSourceProperties(table);
    +        final Object source = ElasticSearchUtils.getMappingSource(table);
     
             final ElasticSearchDataContext dataContext = getUpdateCallback().getDataContext();
             final IndicesAdminClient indicesAdmin = dataContext.getElasticSearchClient().admin().indices();
             final String indexName = dataContext.getIndexName();
     
             final PutMappingRequestBuilder requestBuilder = new PutMappingRequestBuilder(indicesAdmin)
                     .setIndices(indexName).setType(table.getName());
    -        requestBuilder.setSource(sourceProperties.toArray());
    +        if (source instanceof Map) {
    --- End diff --
    
    I'm obviously missing something, but it seems to me to always be a `Map`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---