Posted to dev@trafodion.apache.org by Devaraj Das <dd...@hortonworks.com> on 2017/06/28 21:24:16 UTC

Secondary index maintenance/management code in Trafodion..

Hi folks, I am curious to know how Trafodion handles the management and maintenance of secondary indexes for data tables stored in HBase. Can you please point me to the design/code for this?

Thanks

Devaraj

RE: Secondary index maintenance/management code in Trafodion..

Posted by Dave Birdsall <da...@esgyn.com>.
Hi,

The transaction is initiated by the client, but transaction processing happens at various places in the stack.

A "tm" process on the client's node coordinates the transaction, managing the details of the two-phase commit protocol.

Each HBase RegionServer functions as a resource manager within two-phase commit.

A Trafodion co-processor runs in each RegionServer managing the details of a given transaction for affected rows stored at that RegionServer. Changed rows are kept in memory until transaction commit.

Trafodion uses the MVCC model for transactions. Rows are not changed in the HBase tables until transaction commit time.
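
To make the flow above concrete, here is a minimal sketch in Python. It is not Trafodion code: the class names, the `healthy` flag and the dict-based "tables" are all hypothetical, chosen only to illustrate a coordinator driving two-phase commit over participants that hold changed rows in memory until commit.

```python
from dataclasses import dataclass, field


@dataclass
class RegionParticipant:
    """Hypothetical stand-in for the per-RegionServer transaction logic."""
    name: str
    healthy: bool = True                             # made-up switch to force a "no" vote
    committed: dict = field(default_factory=dict)    # rows visible in the stored table
    pending: dict = field(default_factory=dict)      # txid -> {row_key: value}, in memory

    def write(self, txid, row_key, value):
        # Changed rows are only buffered; the stored table is untouched.
        self.pending.setdefault(txid, {})[row_key] = value

    def prepare(self, txid):
        # Phase 1: vote. A real participant would check for conflicts here.
        return self.healthy and txid in self.pending

    def commit(self, txid):
        # Phase 2: apply the buffered rows to the stored table.
        self.committed.update(self.pending.pop(txid, {}))

    def abort(self, txid):
        # Discard the buffered rows; the stored table never saw them.
        self.pending.pop(txid, None)


class Coordinator:
    """Hypothetical stand-in for the coordinating "tm" process."""

    def __init__(self, participants):
        self.participants = participants

    def commit(self, txid):
        involved = [p for p in self.participants if txid in p.pending]
        if all(p.prepare(txid) for p in involved):
            for p in involved:
                p.commit(txid)
            return True
        for p in involved:
            p.abort(txid)
        return False


rs1, rs2 = RegionParticipant("rs1"), RegionParticipant("rs2")
tm = Coordinator([rs1, rs2])
rs1.write("tx1", "base:42", "Ann")       # base table row on one participant
rs2.write("tx1", "index:Ann|42", "")     # index row on another participant
outcome = tm.commit("tx1")               # both rows become visible together
```

Either every participant applies its buffered rows or none does, which is the property that keeps base table and index rows consistent.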

Dave

-----Original Message-----
From: Devaraj Das [mailto:ddas@hortonworks.com] 
Sent: Wednesday, June 28, 2017 2:45 PM
To: dev@trafodion.incubator.apache.org
Subject: Re: Secondary index maintenance/management code in Trafodion..

Hi Dave, thanks for the nice details. Just to be sure I get it right, the data/index transactions for INSERTs/UPDATEs get executed on the client side, right?
________________________________________
From: Dave Birdsall <da...@esgyn.com>
Sent: Wednesday, June 28, 2017 2:34 PM
To: dev@trafodion.incubator.apache.org
Subject: RE: Secondary index maintenance/management code in Trafodion..

Hi,

Secondary indexes are managed by the Trafodion engine itself; HBase is not aware of them.

From an HBase perspective, a Trafodion secondary index is simply another HBase table.

The Trafodion engine provides a suite of DDL for the management of secondary indexes. One can create them (via CREATE INDEX), drop them (via DROP INDEX) and alter certain attributes (via ALTER INDEX). These DDL operations are documented in the Trafodion SQL Reference manual, here: http://trafodion.apache.org/docs/sql_reference/index.html

From a DML perspective, the Trafodion compiler and optimizer decide when to use a particular secondary index. An INSERT operation will, of course, insert into all secondary indexes as well as the base table. The same goes for a DELETE. For UPDATEs, only the affected indexes are operated upon. Index maintenance is done in parallel as much as possible (for example, we pipeline rows from the base table to the indexes, and we update the indexes themselves in parallel). Of course, all of this index maintenance is done under a transaction, so we maintain consistency between indexes and base table.

For SELECTs, the Trafodion optimizer will pick which secondary indexes to access based upon cost. Several possibilities exist: It may do an index-only access (if all the relevant columns happen to be available in some secondary index), a base table only access (if no secondary index is relevant or cheap) or a join of a secondary index to the base table (e.g. if not all of the relevant columns are in the index, but we can get direct access to the index and then use those keys to get direct access to the base table).

Hope this helps,

Dave

-----Original Message-----
From: Devaraj Das [mailto:ddas@hortonworks.com]
Sent: Wednesday, June 28, 2017 2:24 PM
To: dev@trafodion.incubator.apache.org
Subject: Secondary index maintenance/management code in Trafodion..

Hi folks, I am curious to know how Trafodion handles the management and maintenance of secondary indexes for data tables stored in HBase. Can you please point me to the design/code for this?

Thanks

Devaraj



Re: Secondary index maintenance/management code in Trafodion..

Posted by Devaraj Das <dd...@hortonworks.com>.
Hi Dave, thanks for the nice details. Just to be sure I get it right, the data/index transactions for INSERTs/UPDATEs get executed on the client side, right?



RE: Secondary index maintenance/management code in Trafodion..

Posted by Dave Birdsall <da...@esgyn.com>.
Hi,

Secondary indexes are managed by the Trafodion engine itself; HBase is not aware of them.

From an HBase perspective, a Trafodion secondary index is simply another HBase table.

The Trafodion engine provides a suite of DDL for the management of secondary indexes. One can create them (via CREATE INDEX), drop them (via DROP INDEX) and alter certain attributes (via ALTER INDEX). These DDL operations are documented in the Trafodion SQL Reference manual, here: http://trafodion.apache.org/docs/sql_reference/index.html
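
As a rough illustration of the create/drop lifecycle, the sketch below runs generic CREATE INDEX and DROP INDEX statements from Python. Note the assumptions: it uses SQLite (via the standard `sqlite3` module) purely as a locally runnable stand-in, the table and index names are made up, and Trafodion's own syntax and options (including ALTER INDEX, which SQLite does not have) are the ones in the SQL Reference Manual, not the ones shown here.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in engine for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)


def index_names(connection):
    """List user-created indexes from SQLite's catalog table."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name NOT LIKE 'sqlite_%'"
    )
    return sorted(r[0] for r in rows)


# Create a secondary index on a non-key column (hypothetical names).
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer)")
created = index_names(conn)

# Drop it again; the base table itself is unaffected.
conn.execute("DROP INDEX ix_orders_customer")
remaining = index_names(conn)
```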

From a DML perspective, the Trafodion compiler and optimizer decide when to use a particular secondary index. An INSERT operation will, of course, insert into all secondary indexes as well as the base table. The same goes for a DELETE. For UPDATEs, only the affected indexes are operated upon. Index maintenance is done in parallel as much as possible (for example, we pipeline rows from the base table to the indexes, and we update the indexes themselves in parallel). Of course, all of this index maintenance is done under a transaction, so we maintain consistency between indexes and base table.
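
The maintenance rules above can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the `Table` class, the index key layout (indexed column values followed by the base-table key) and the rule that an index counts as "affected" when one of its columns appears in the update are not taken from the Trafodion source. Parallelism and transactions are left out.

```python
def index_key(index_cols, row, primary_key):
    # Index row key: indexed column values followed by the base-table key.
    return tuple(row[c] for c in index_cols) + (primary_key,)


class Table:
    """Toy base table plus secondary indexes, each kept as its own mapping."""

    def __init__(self, indexes):
        self.base = {}                        # primary key -> row dict
        self.index_cols = dict(indexes)       # index name -> tuple of column names
        self.index_rows = {name: set() for name in self.index_cols}

    def insert(self, pk, row):
        self.base[pk] = dict(row)
        for name, cols in self.index_cols.items():    # every index gets a row
            self.index_rows[name].add(index_key(cols, row, pk))
        return set(self.index_cols)

    def delete(self, pk):
        row = self.base.pop(pk)
        for name, cols in self.index_cols.items():    # every index loses a row
            self.index_rows[name].discard(index_key(cols, row, pk))
        return set(self.index_cols)

    def update(self, pk, changes):
        old = self.base[pk]
        new = {**old, **changes}
        touched = set()
        for name, cols in self.index_cols.items():
            if any(c in changes for c in cols):       # only affected indexes
                self.index_rows[name].discard(index_key(cols, old, pk))
                self.index_rows[name].add(index_key(cols, new, pk))
                touched.add(name)
        self.base[pk] = new
        return touched


t = Table({"ix_city": ("city",), "ix_name": ("name",)})
t.insert(1, {"name": "Ann", "city": "Oslo", "age": 30})
touched_by_city_update = t.update(1, {"city": "Rome"})   # only ix_city is maintained
touched_by_age_update = t.update(1, {"age": 31})         # no index covers "age"
```

Each method returns the set of index names it maintained, which makes the INSERT/DELETE versus UPDATE difference easy to see.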

For SELECTs, the Trafodion optimizer will pick which secondary indexes to access based upon cost. Several possibilities exist: It may do an index-only access (if all the relevant columns happen to be available in some secondary index), a base table only access (if no secondary index is relevant or cheap) or a join of a secondary index to the base table (e.g. if not all of the relevant columns are in the index, but we can get direct access to the index and then use those keys to get direct access to the base table).
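
The three possibilities can be sketched as a tiny cost comparison in Python. This is only an illustration under stated assumptions: the function name, the cost numbers and the rule that an index is "relevant" when it contains all predicate columns are hypothetical, and a real optimizer weighs far more (key order, selectivity, statistics).

```python
def choose_access_path(needed_cols, predicate_cols, indexes, costs):
    """Return (plan, index_name) with the lowest estimated cost.

    needed_cols:    every column the query references (select list + predicate)
    predicate_cols: columns used in the WHERE clause
    indexes:        index name -> set of columns stored in that index
    costs:          made-up estimates keyed by 'base_scan', 'index_only', 'index_join'
    """
    candidates = [(costs["base_scan"], "base-table-only", None)]
    for name, cols in sorted(indexes.items()):
        if not predicate_cols <= cols:
            continue                      # index cannot help with the predicate
        if needed_cols <= cols:
            # All referenced columns are in the index: no base table access.
            candidates.append((costs["index_only"], "index-only", name))
        else:
            # Use the index to find keys, then fetch the rest from the base table.
            candidates.append((costs["index_join"], "index-join-base", name))
    cost, plan, index = min(candidates, key=lambda c: c[0])
    return plan, index


indexes = {
    "ix_city": {"city", "id"},
    "ix_city_name": {"city", "name", "id"},
}
costs = {"base_scan": 100, "index_only": 5, "index_join": 20}

covered = choose_access_path({"city", "name"}, {"city"}, indexes, costs)
joined = choose_access_path({"city", "age"}, {"city"}, indexes, costs)
scanned = choose_access_path({"age"}, {"age"}, indexes, costs)
```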

Hope this helps,

Dave
