Posted to commits@samza.apache.org by "Roger Hoover (JIRA)" <ji...@apache.org> on 2015/07/23 19:13:05 UTC

[jira] [Created] (SAMZA-741) Add support for versioning to Elasticsearch System Producer

Roger Hoover created SAMZA-741:
----------------------------------

             Summary: Add support for versioning to Elasticsearch System Producer
                 Key: SAMZA-741
                 URL: https://issues.apache.org/jira/browse/SAMZA-741
             Project: Samza
          Issue Type: Improvement
            Reporter: Roger Hoover
            Priority: Minor
             Fix For: 0.10.0


Versioning (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) lets you prevent duplicate messages from temporarily overwriting new versions of a document with old ones.

Currently, the Elasticsearch system producer does not support setting versions.  Since Kafka/Samza don't support message metadata besides a key (I think), the best approach seems to be to embed metadata into the stream name.

We can add version and version_type as options to the stream name.  These match the parameters of the Elasticsearch REST API (https://www.elastic.co/blog/elasticsearch-versioning-support):

{index-name}/{type-name}?version={version-id}&version_type={version-type}
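
For illustration only, a stream name in this form could be split into its parts roughly as follows.  This is a Python sketch, not the producer's actual code; the names EsStreamTarget and parse_stream_name are hypothetical, and treating the version as an integer and both options as optional are assumptions.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qs


class EsStreamTarget(NamedTuple):
    """Hypothetical holder for the parts of a stream name."""
    index: str
    doc_type: str
    version: Optional[int]        # assumption: versions are integers
    version_type: Optional[str]   # e.g. "external"; not validated here


def parse_stream_name(stream_name: str) -> EsStreamTarget:
    """Parse {index-name}/{type-name}?version=...&version_type=..."""
    # Separate the index/type path from the optional query string.
    path, _, query = stream_name.partition("?")
    index, sep, doc_type = path.partition("/")
    if not sep or not index or not doc_type or "/" in doc_type:
        raise ValueError(
            "expected {index-name}/{type-name}, got %r" % stream_name
        )

    # parse_qs maps each key to a list of values; take the last one given.
    params = parse_qs(query)
    version = params.get("version", [None])[-1]
    version_type = params.get("version_type", [None])[-1]

    return EsStreamTarget(
        index=index,
        doc_type=doc_type,
        version=int(version) if version is not None else None,
        version_type=version_type,
    )
```

A stream name without the query string still parses, with both version fields left unset, so existing stream names would keep working under this scheme.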



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)