Posted to commits@samza.apache.org by "Roger Hoover (JIRA)" <ji...@apache.org> on 2015/07/23 19:48:04 UTC
[jira] [Updated] (SAMZA-741) Add support for versioning to Elasticsearch System Producer
[ https://issues.apache.org/jira/browse/SAMZA-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roger Hoover updated SAMZA-741:
-------------------------------
Description:
Versioning (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) lets you prevent duplicate messages from temporarily overwriting new versions of a document with old ones.
Currently, the Elasticsearch system producer does not support setting versions. Since Kafka/Samza don't support message metadata besides a key (I think), the best approach seems to be to embed metadata into the stream name.
We can add version and version_type as options to the stream name. These match the parameters of the Elasticsearch REST API (https://www.elastic.co/blog/elasticsearch-versioning-support):
{noformat}
{index-name}/{type-name}?version={version-id}&version_type={version-type}
{noformat}
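To make the proposed format concrete, here is a minimal Java sketch of how a stream name in this form might be split into its parts. The class name EsStreamName and its fields are hypothetical and exist only for illustration; this is not the producer's actual code, and it assumes version-id is a numeric value and that both options are optional.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for a parsed stream name; an illustration only, not Samza code. */
final class EsStreamName {
    final String index;
    final String type;
    final Long version;        // null when no version option is given
    final String versionType;  // null when no version_type option is given

    private EsStreamName(String index, String type, Long version, String versionType) {
        this.index = index;
        this.type = type;
        this.version = version;
        this.versionType = versionType;
    }

    /** Parses {index-name}/{type-name}?version={version-id}&version_type={version-type}. */
    static EsStreamName parse(String stream) {
        String path = stream;
        String query = "";
        int q = stream.indexOf('?');
        if (q >= 0) {
            path = stream.substring(0, q);
            query = stream.substring(q + 1);
        }

        // The path must be exactly {index-name}/{type-name}, both non-empty.
        String[] parts = path.split("/", -1);
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Expected {index-name}/{type-name}: " + stream);
        }

        // Collect key=value options from the query part, if any.
        Map<String, String> options = new LinkedHashMap<>();
        if (!query.isEmpty()) {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("Malformed option: " + pair);
                }
                options.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }

        // Assumption: the version is numeric; a non-numeric value throws NumberFormatException.
        Long version = options.containsKey("version")
                ? Long.valueOf(options.get("version"))
                : null;
        return new EsStreamName(parts[0], parts[1], version, options.get("version_type"));
    }
}
```

For example, EsStreamName.parse("orders/order?version=42&version_type=external") would yield index "orders", type "order", version 42 and version type "external", while a plain "orders/order" leaves both version fields unset.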
was:
Versioning (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) lets you prevent duplicate messages from temporarily overwriting new versions of a document with old ones.
Currently, the Elasticsearch system producer does not support setting versions. Since Kafka/Samza don't support message metadata besides a key (I think), the best approach seems to be to embed metadata into the stream name.
We can add version and version_type as options to the stream name. These match the parameters of the Elasticsearch REST API (https://www.elastic.co/blog/elasticsearch-versioning-support):
{index-name}/{type-name}?version={version-id}&version_type={version-type}
> Add support for versioning to Elasticsearch System Producer
> -----------------------------------------------------------
>
> Key: SAMZA-741
> URL: https://issues.apache.org/jira/browse/SAMZA-741
> Project: Samza
> Issue Type: Improvement
> Reporter: Roger Hoover
> Priority: Minor
> Fix For: 0.10.0
>
>
> Versioning (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) lets you prevent duplicate messages from temporarily overwriting new versions of a document with old ones.
> Currently, the Elasticsearch system producer does not support setting versions. Since Kafka/Samza don't support message metadata besides a key (I think), the best approach seems to be to embed metadata into the stream name.
> We can add version and version_type as options to the stream name. These match the parameters of the Elasticsearch REST API (https://www.elastic.co/blog/elasticsearch-versioning-support):
> {noformat}
> {index-name}/{type-name}?version={version-id}&version_type={version-type}
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)