You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafodion.apache.org by Dave Birdsall <da...@esgyn.com> on 2017/04/12 19:53:40 UTC

Embedding an insert/select inside an insert/select?

Hi,

Trafodion supports a notion of embedded IUD statements. I can, for example, do:

Select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This statement inserts all the rows from t1 into t2, and returns those rows where A = 5 as its result.

Just for fun, I tried the following:

Insert into t3 select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This fails with the following error:

*** ERROR[4171] Embedded INSERT statements are not supported in INSERT statements.

This comes from GenericUpdate::bindNode in optimizer/BindRelExpr.cpp.

Just for fun, I commented out the code that raises this error. I got some reasonable plans. For example:

>>prepare ss from insert into t3 select * from (insert into t2 select * from t1)
+> as tx where tx.a = 5;

--- SQL command prepared.
>>explain options 'f' ss;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     5.00E+005
7    .    8    esp_exchange                    1:4(hash2)            5.00E+005
5    6    7    tuple_flow                                            5.00E+005
.    .    6    trafodion_insert                T3                    1.00E+000
4    .    5    sort                                                  5.00E+005
2    3    4    nested_join                                           5.00E+005
.    .    3    trafodion_insert                T2                    9.99E-006
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.
>>

In this example, T1 has 1 million rows. T1, T2 and T3 are simple tables with just two integer columns, A and B, primary key A. I had done UPDATE STATISTICS on t1 after populating it. I salted each table 4 ways on the primary key.

This looks like a nice parallel plan, though I wonder about the need for the sort nodes, since the partitions should line up.

This is a lot of background for my question: Why does the binder forbid such statements? Are there some holes in the handling of nested IUDs that need to be filled before such a feature is turned on?

Thanks,

Dave



RE: Embedding an insert/select inside an insert/select?

Posted by Selva Govindarajan <se...@esgyn.com>.
I have seen Trafodion optimizer uses sort, a blocking operator for self-referencing updates to provide before image.

Selva 

-----Original Message-----
From: Rohit Jain [mailto:rohit.jain@esgyn.com] 
Sent: Wednesday, April 12, 2017 1:46 PM
To: dev@trafodion.incubator.apache.org
Subject: RE: Embedding an insert/select inside an insert/select?

Unless the keys are the same and so is the partitioning, then a sort would be efficient, though does that matter as much for HBase as it does a B-tree?

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com] 
Sent: Wednesday, April 12, 2017 3:24 PM
To: dev@trafodion.incubator.apache.org
Subject: RE: Embedding an insert/select inside an insert/select?

Good thought. I did an experiment. With the 4171 error check commented out, I did:

>>prepare s1 from insert into t1 select * from (insert into t2 select * from t1)
+> as tx;

*** WARNING[6008] Statistics for column (A) from table TRAFODION.SEABASE.T2 were not available. As a result, the access path chosen might not be the best possible.

--- SQL command prepared.
>>explain options 'f' s1;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     1.00E+006
7    .    8    esp_exchange                    1:4(hash2)            1.00E+006
5    6    7    tuple_flow                                            1.00E+006
.    .    6    trafodion_insert                T1                    1.00E+000
4    .    5    sort                                                  1.00E+006
2    3    4    nested_join                                           1.00E+006
.    .    3    trafodion_insert                T2                    1.00E+000
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.

I'm guessing this would work correctly because the sort operators would block until the scans are done. In fact, that might be why the sort operators are there (I had wondered about that in my original e-mail).

Dave


-----Original Message-----
From: Rohit Jain [mailto:rohit.jain@esgyn.com] 
Sent: Wednesday, April 12, 2017 1:19 PM
To: dev@trafodion.incubator.apache.org
Subject: RE: Embedding an insert/select inside an insert/select?

Maybe because it will have to verify that none of the outer inserts are into the same table(s) as the inner insert(s) so that you don't run into a Halloween problem perhaps?

Rohit

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com] 
Sent: Wednesday, April 12, 2017 2:54 PM
To: dev@trafodion.incubator.apache.org
Subject: Embedding an insert/select inside an insert/select?

Hi,

Trafodion supports a notion of embedded IUD statements. I can, for example, do:

Select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This statement inserts all the rows from t1 into t2, and returns those rows where A = 5 as its result.

Just for fun, I tried the following:

Insert into t3 select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This fails with the following error:

*** ERROR[4171] Embedded INSERT statements are not supported in INSERT statements.

This comes from GenericUpdate::bindNode in optimizer/BindRelExpr.cpp.

Just for fun, I commented out the code that raises this error. I got some reasonable plans. For example:

>>prepare ss from insert into t3 select * from (insert into t2 select * from t1)
+> as tx where tx.a = 5;

--- SQL command prepared.
>>explain options 'f' ss;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     5.00E+005
7    .    8    esp_exchange                    1:4(hash2)            5.00E+005
5    6    7    tuple_flow                                            5.00E+005
.    .    6    trafodion_insert                T3                    1.00E+000
4    .    5    sort                                                  5.00E+005
2    3    4    nested_join                                           5.00E+005
.    .    3    trafodion_insert                T2                    9.99E-006
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.
>>

In this example, T1 has 1 million rows. T1, T2 and T3 are simple tables with just two integer columns, A and B, primary key A. I had done UPDATE STATISTICS on t1 after populating it. I salted each table 4 ways on the primary key.

This looks like a nice parallel plan, though I wonder about the need for the sort nodes, since the partitions should line up.

This is a lot of background for my question: Why does the binder forbid such statements? Are there some holes in the handling of nested IUDs that need to be filled before such a feature is turned on?

Thanks,

Dave



RE: Embedding an insert/select inside an insert/select?

Posted by Rohit Jain <ro...@esgyn.com>.
Unless the keys are the same and so is the partitioning, then a sort would be efficient, though does that matter as much for HBase as it does a B-tree?

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com] 
Sent: Wednesday, April 12, 2017 3:24 PM
To: dev@trafodion.incubator.apache.org
Subject: RE: Embedding an insert/select inside an insert/select?

Good thought. I did an experiment. With the 4171 error check commented out, I did:

>>prepare s1 from insert into t1 select * from (insert into t2 select * from t1)
+> as tx;

*** WARNING[6008] Statistics for column (A) from table TRAFODION.SEABASE.T2 were not available. As a result, the access path chosen might not be the best possible.

--- SQL command prepared.
>>explain options 'f' s1;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     1.00E+006
7    .    8    esp_exchange                    1:4(hash2)            1.00E+006
5    6    7    tuple_flow                                            1.00E+006
.    .    6    trafodion_insert                T1                    1.00E+000
4    .    5    sort                                                  1.00E+006
2    3    4    nested_join                                           1.00E+006
.    .    3    trafodion_insert                T2                    1.00E+000
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.

I'm guessing this would work correctly because the sort operators would block until the scans are done. In fact, that might be why the sort operators are there (I had wondered about that in my original e-mail).

Dave


-----Original Message-----
From: Rohit Jain [mailto:rohit.jain@esgyn.com] 
Sent: Wednesday, April 12, 2017 1:19 PM
To: dev@trafodion.incubator.apache.org
Subject: RE: Embedding an insert/select inside an insert/select?

Maybe because it will have to verify that none of the outer inserts are into the same table(s) as the inner insert(s) so that you don't run into a Halloween problem perhaps?

Rohit

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com] 
Sent: Wednesday, April 12, 2017 2:54 PM
To: dev@trafodion.incubator.apache.org
Subject: Embedding an insert/select inside an insert/select?

Hi,

Trafodion supports a notion of embedded IUD statements. I can, for example, do:

Select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This statement inserts all the rows from t1 into t2, and returns those rows where A = 5 as its result.

Just for fun, I tried the following:

Insert into t3 select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This fails with the following error:

*** ERROR[4171] Embedded INSERT statements are not supported in INSERT statements.

This comes from GenericUpdate::bindNode in optimizer/BindRelExpr.cpp.

Just for fun, I commented out the code that raises this error. I got some reasonable plans. For example:

>>prepare ss from insert into t3 select * from (insert into t2 select * from t1)
+> as tx where tx.a = 5;

--- SQL command prepared.
>>explain options 'f' ss;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     5.00E+005
7    .    8    esp_exchange                    1:4(hash2)            5.00E+005
5    6    7    tuple_flow                                            5.00E+005
.    .    6    trafodion_insert                T3                    1.00E+000
4    .    5    sort                                                  5.00E+005
2    3    4    nested_join                                           5.00E+005
.    .    3    trafodion_insert                T2                    9.99E-006
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.
>>

In this example, T1 has 1 million rows. T1, T2 and T3 are simple tables with just two integer columns, A and B, primary key A. I had done UPDATE STATISTICS on t1 after populating it. I salted each table 4 ways on the primary key.

This looks like a nice parallel plan, though I wonder about the need for the sort nodes, since the partitions should line up.

This is a lot of background for my question: Why does the binder forbid such statements? Are there some holes in the handling of nested IUDs that need to be filled before such a feature is turned on?

Thanks,

Dave



RE: Embedding an insert/select inside an insert/select?

Posted by Dave Birdsall <da...@esgyn.com>.
Good thought. I did an experiment. With the 4171 error check commented out, I did:

>>prepare s1 from insert into t1 select * from (insert into t2 select * from t1)
+> as tx;

*** WARNING[6008] Statistics for column (A) from table TRAFODION.SEABASE.T2 were not available. As a result, the access path chosen might not be the best possible.

--- SQL command prepared.
>>explain options 'f' s1;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     1.00E+006
7    .    8    esp_exchange                    1:4(hash2)            1.00E+006
5    6    7    tuple_flow                                            1.00E+006
.    .    6    trafodion_insert                T1                    1.00E+000
4    .    5    sort                                                  1.00E+006
2    3    4    nested_join                                           1.00E+006
.    .    3    trafodion_insert                T2                    1.00E+000
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.

I'm guessing this would work correctly because the sort operators would block until the scans are done. In fact, that might be why the sort operators are there (I had wondered about that in my original e-mail).

Dave


-----Original Message-----
From: Rohit Jain [mailto:rohit.jain@esgyn.com] 
Sent: Wednesday, April 12, 2017 1:19 PM
To: dev@trafodion.incubator.apache.org
Subject: RE: Embedding an insert/select inside an insert/select?

Maybe because it will have to verify that none of the outer inserts are into the same table(s) as the inner insert(s) so that you don't run into a Halloween problem perhaps?

Rohit

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com] 
Sent: Wednesday, April 12, 2017 2:54 PM
To: dev@trafodion.incubator.apache.org
Subject: Embedding an insert/select inside an insert/select?

Hi,

Trafodion supports a notion of embedded IUD statements. I can, for example, do:

Select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This statement inserts all the rows from t1 into t2, and returns those rows where A = 5 as its result.

Just for fun, I tried the following:

Insert into t3 select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This fails with the following error:

*** ERROR[4171] Embedded INSERT statements are not supported in INSERT statements.

This comes from GenericUpdate::bindNode in optimizer/BindRelExpr.cpp.

Just for fun, I commented out the code that raises this error. I got some reasonable plans. For example:

>>prepare ss from insert into t3 select * from (insert into t2 select * from t1)
+> as tx where tx.a = 5;

--- SQL command prepared.
>>explain options 'f' ss;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     5.00E+005
7    .    8    esp_exchange                    1:4(hash2)            5.00E+005
5    6    7    tuple_flow                                            5.00E+005
.    .    6    trafodion_insert                T3                    1.00E+000
4    .    5    sort                                                  5.00E+005
2    3    4    nested_join                                           5.00E+005
.    .    3    trafodion_insert                T2                    9.99E-006
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.
>>

In this example, T1 has 1 million rows. T1, T2 and T3 are simple tables with just two integer columns, A and B, primary key A. I had done UPDATE STATISTICS on t1 after populating it. I salted each table 4 ways on the primary key.

This looks like a nice parallel plan, though I wonder about the need for the sort nodes, since the partitions should line up.

This is a lot of background for my question: Why does the binder forbid such statements? Are there some holes in the handling of nested IUDs that need to be filled before such a feature is turned on?

Thanks,

Dave



RE: Embedding an insert/select inside an insert/select?

Posted by Rohit Jain <ro...@esgyn.com>.
Maybe because it will have to verify that none of the outer inserts are into the same table(s) as the inner insert(s) so that you don't run into a Halloween problem perhaps?

Rohit

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com] 
Sent: Wednesday, April 12, 2017 2:54 PM
To: dev@trafodion.incubator.apache.org
Subject: Embedding an insert/select inside an insert/select?

Hi,

Trafodion supports a notion of embedded IUD statements. I can, for example, do:

Select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This statement inserts all the rows from t1 into t2, and returns those rows where A = 5 as its result.

Just for fun, I tried the following:

Insert into t3 select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This fails with the following error:

*** ERROR[4171] Embedded INSERT statements are not supported in INSERT statements.

This comes from GenericUpdate::bindNode in optimizer/BindRelExpr.cpp.

Just for fun, I commented out the code that raises this error. I got some reasonable plans. For example:

>>prepare ss from insert into t3 select * from (insert into t2 select * from t1)
+> as tx where tx.a = 5;

--- SQL command prepared.
>>explain options 'f' ss;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     5.00E+005
7    .    8    esp_exchange                    1:4(hash2)            5.00E+005
5    6    7    tuple_flow                                            5.00E+005
.    .    6    trafodion_insert                T3                    1.00E+000
4    .    5    sort                                                  5.00E+005
2    3    4    nested_join                                           5.00E+005
.    .    3    trafodion_insert                T2                    9.99E-006
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.
>>

In this example, T1 has 1 million rows. T1, T2 and T3 are simple tables with just two integer columns, A and B, primary key A. I had done UPDATE STATISTICS on t1 after populating it. I salted each table 4 ways on the primary key.

This looks like a nice parallel plan, though I wonder about the need for the sort nodes, since the partitions should line up.

This is a lot of background for my question: Why does the binder forbid such statements? Are there some holes in the handling of nested IUDs that need to be filled before such a feature is turned on?

Thanks,

Dave



RE: Embedding an insert/select inside an insert/select?

Posted by Eric Owhadi <er...@esgyn.com>.
That would be a fantastic feature,
Thanks for digging into this Dave!
Eric

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com] 
Sent: Wednesday, April 12, 2017 2:54 PM
To: dev@trafodion.incubator.apache.org
Subject: Embedding an insert/select inside an insert/select?

Hi,

Trafodion supports a notion of embedded IUD statements. I can, for example, do:

Select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This statement inserts all the rows from t1 into t2, and returns those rows where A = 5 as its result.

Just for fun, I tried the following:

Insert into t3 select * from (insert into t2 select * from t1) as tx where tx.a = 5;

This fails with the following error:

*** ERROR[4171] Embedded INSERT statements are not supported in INSERT statements.

This comes from GenericUpdate::bindNode in optimizer/BindRelExpr.cpp.

Just for fun, I commented out the code that raises this error. I got some reasonable plans. For example:

>>prepare ss from insert into t3 select * from (insert into t2 select * from t1)
+> as tx where tx.a = 5;

--- SQL command prepared.
>>explain options 'f' ss;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

8    .    9    root                            x                     5.00E+005
7    .    8    esp_exchange                    1:4(hash2)            5.00E+005
5    6    7    tuple_flow                                            5.00E+005
.    .    6    trafodion_insert                T3                    1.00E+000
4    .    5    sort                                                  5.00E+005
2    3    4    nested_join                                           5.00E+005
.    .    3    trafodion_insert                T2                    9.99E-006
1    .    2    sort                                                  1.00E+006
.    .    1    trafodion_scan                  T1                    1.00E+006

--- SQL operation complete.
>>

In this example, T1 has 1 million rows. T1, T2 and T3 are simple tables with just two integer columns, A and B, primary key A. I had done UPDATE STATISTICS on t1 after populating it. I salted each table 4 ways on the primary key.

This looks like a nice parallel plan, though I wonder about the need for the sort nodes, since the partitions should line up.

This is a lot of background for my question: Why does the binder forbid such statements? Are there some holes in the handling of nested IUDs that need to be filled before such a feature is turned on?

Thanks,

Dave