Posted to dev@apex.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/11/22 05:18:59 UTC
[jira] [Commented] (APEXMALHAR-2181) Non-Transactional Prepared
Statement Based Cassandra Upsert (Update + Insert ) output Operator
[ https://issues.apache.org/jira/browse/APEXMALHAR-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685731#comment-15685731 ]
ASF GitHub Bot commented on APEXMALHAR-2181:
--------------------------------------------
Github user asfgit closed the pull request at:
https://github.com/apache/apex-malhar/pull/466
> Non-Transactional Prepared Statement Based Cassandra Upsert (Update + Insert ) output Operator
> ----------------------------------------------------------------------------------------------
>
> Key: APEXMALHAR-2181
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2181
> Project: Apache Apex Malhar
> Issue Type: New Feature
> Reporter: Ananth
> Assignee: Ananth
>
> An abstract operator that mutates Cassandra rows using PreparedStatements for faster execution.
> It accommodates EXACTLY_ONCE semantics if concrete implementations choose to implement an abstract method with
> a meaningful implementation. As Cassandra is not a pure transactional database, the burden is on the concrete
> implementation of the operator, and ONLY during the reconciliation window (not for any other windows).
> ===========================================================
> The typical execution flow is as follows :
> 1. Create a concrete implementation of this class by extending it and implementing a few methods.
> 2. Define the payload, i.e. the POJO that represents a Cassandra row. The payload is part of the execution context
> {@link UpsertExecutionContext} and is a template parameter of this class.
> 3. The Upstream operator that wants to write to Cassandra does the following
> a. Create an instance of {@link UpsertExecutionContext}
> b. Set the payload (an instance of the POJO defined in step 2 above)
> c. Set additional execution context parameters such as the collection handling style, list placement style,
> TTL overrides, update only if the primary key exists, consistency level etc.
> 4. The concrete implementation would then execute this context as a cassandra row mutation
> ===========================================================
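
The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: the classes
UserRow and Context, the table "users" and its columns are hypothetical stand-ins and are NOT the Malhar classes
(the real context is UpsertExecutionContext, whose API is not shown in this description). Only the CQL clauses
USING TTL and IF EXISTS are standard Cassandra syntax.

```java
// Hypothetical, simplified stand-ins for illustration; not the Malhar API.
class UpsertFlowSketch {

    /** Step 2: the payload POJO that represents one Cassandra row (assumed table "users"). */
    static class UserRow {
        final String userId;
        final String city;

        UserRow(String userId, String city) {
            this.userId = userId;
            this.city = city;
        }
    }

    /** Simplified stand-in for an execution context that wraps a payload. */
    static class Context<T> {
        T payload;
        Integer ttlSeconds;         // null means: no per-tuple TTL override
        boolean updateOnlyIfExists; // true means: update only, never insert
    }

    /** Step 3: what an upstream operator would do before emitting the tuple. */
    static Context<UserRow> buildTuple(UserRow row, Integer ttlSeconds, boolean updateOnlyIfExists) {
        Context<UserRow> ctx = new Context<>();
        ctx.payload = row;                           // 3b: set the payload
        ctx.ttlSeconds = ttlSeconds;                 // 3c: optional TTL override
        ctx.updateOnlyIfExists = updateOnlyIfExists; // 3c: conditional update flag
        return ctx;
    }

    /** Step 4: derive the CQL text that a prepared statement for this context could use. */
    static String toCql(Context<UserRow> ctx) {
        StringBuilder cql = new StringBuilder("UPDATE users");
        if (ctx.ttlSeconds != null) {
            cql.append(" USING TTL ").append(ctx.ttlSeconds);
        }
        cql.append(" SET city = ? WHERE user_id = ?");
        if (ctx.updateOnlyIfExists) {
            cql.append(" IF EXISTS");
        }
        return cql.toString();
    }

    public static void main(String[] args) {
        Context<UserRow> ctx = buildTuple(new UserRow("u1", "Pune"), 3600, true);
        System.out.println(toCql(ctx));
    }
}
```

In a real operator the statement would be prepared once per distinct combination of options and then bound with the
payload values, rather than rebuilt as a string for every tuple.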
> This operator supports the following features
> 1. Highly customizable Connection policies. This is achieved by specifying the ConnectionStateManager.
> There are a good number of connection management aspects that can be
> controlled via {@link ConnectionStateManager} like consistency, load balancing, connection retries,
> table to use, keyspace to use etc. Please refer to the javadoc of {@link ConnectionStateManager}
> 2. Support for Collections : Maps, Lists and Sets are supported.
> User Defined Types as part of collections are also supported.
> 3. Support exists for both adding to an existing collection and removing entries from an existing collection.
> The POJO field that represents a collection holds the entries that are to be added or removed.
> This avoids the pattern of reading and then writing the final value into the cassandra column, which makes
> it suitable for low latency / high write pattern applications as there is no read in the process.
> 4. Supports List Placements : The execution context can be used to specify where the new incoming list
> is to be added (in case there is an existing list in the current column of the current row being mutated).
> Supported options are APPEND or PREPEND to an existing list
> 5. Support for User Defined Types. A POJO can have fields that represent Cassandra columns of custom
> user defined types. Concrete implementations of the operator provide a mapping of the cassandra column name
> to the TypeCodec that is to be used for that field inside cassandra. Please refer to the javadoc of
> {@link this.getCodecsForUserDefinedTypes() } for more details
> 6. Support for custom mapping of POJO payload field names to cassandra column names. Practically speaking,
> POJO field names might not always match Cassandra column names, hence this support. It also avoids
> writing a POJO just for the cassandra operator, so an existing POJO can be passed to this operator.
> Please refer to the javadoc of {@link this.getPojoFieldNameToCassandraColumnNameOverride()} for an example
> 7. TTL support - A default TTL can be set for the Connection (via {@link ConnectionStateManager}) and then used
> for all mutations. This TTL can further be overridden at a tuple execution level to accommodate use cases of
> setting custom column expirations, typically useful in wide row implementations.
> 8. Support for Counter Column tables. Counter tables are also supported, with the values inside the incoming
> POJO added to / subtracted from the counter column accordingly. Please note that the value is not set as an
> absolute value but rather represents the amount to be added to or subtracted from the current counter.
> 9. Support for Composite Primary Keys. All the POJO fields that map to the composite
> primary key are used to resolve the primary key in case of a Composite Primary key table
> 10. Support for conditional updates : This operator can be used as an Update Only operator as opposed to an
> Upsert operator, i.e. update only IF EXISTS. This is achieved by setting the appropriate boolean in the
> {@link UpsertExecutionContext} tuple that is passed from the upstream operator.
> 11. Lenient mapping of POJO fields to Cassandra column names. By default the POJO field names are matched
> case-insensitively to cassandra column names. This can be further customized by overriding mappings. Please refer to feature 6 above.
> 12. Defaults can be overridden at a tuple execution level for TTL & Consistency Policies
> 13. Support for handling Nulls i.e. whether null values in the POJO are to be persisted as is or to be ignored so
> that the application need not perform a read to populate a POJO field if it is not available in the context
> 14. A few autometrics are provided for monitoring the latency aspects of the cassandra cluster
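
Features 6 and 11 together amount to a two-step lookup: an explicit override wins, otherwise the POJO field name is
matched to a column name ignoring case. A minimal sketch of that lookup, assuming a hypothetical helper named
resolveColumn (this is not the operator's actual code, and the real override hook is
getPojoFieldNameToCassandraColumnNameOverride(), whose exact signature is not shown here):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Malhar operator.
class ColumnNameResolverSketch {

    /**
     * Resolves the Cassandra column for a POJO field.
     * 1. If an override exists for the exact field name, use it.
     * 2. Otherwise fall back to a case-insensitive match against the table's columns.
     * Returns null when no column matches.
     */
    static String resolveColumn(String pojoField, Set<String> tableColumns, Map<String, String> overrides) {
        String override = overrides.get(pojoField);
        if (override != null) {
            return override;
        }
        for (String column : tableColumns) {
            if (column.equalsIgnoreCase(pojoField)) {
                return column;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> columns = new HashSet<>(Arrays.asList("username", "age"));
        // Lenient default: "userName" matches the column "username".
        System.out.println(resolveColumn("userName", columns, Collections.<String, String>emptyMap()));
        // Explicit override: "userName" is mapped to "user_name".
        System.out.println(resolveColumn("userName", columns, Collections.singletonMap("userName", "user_name")));
    }
}
```

Checking overrides first keeps the lenient default from hiding a deliberate mapping when both could apply.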