Posted to derby-dev@db.apache.org by Jeffrey Lichtman <sw...@rcn.com> on 2005/10/20 21:10:32 UTC

Stack overflow from materialized subquery

I have figured out what's behind the stack overflow reported by 
Daniel Skiles in the derby-user list. To summarize, he gets a stack 
overflow from this query:

     SELECT path from filsystemfiles where
         path not in (select path from existingfiles)

The stack trace shows a large number of UnionResultSets calling 
openCore(). The odd thing, of course, is that the query doesn't 
contain any unions.

It turns out that, at some point, an optimization was added to 
subquery processing to materialize the subquery result in memory in 
certain cases. According to the comments, the implementor of this 
change was Tingjian Ge. The comments also refer to "beetle 4373", 
which I assume is a cute way of referring to a bug number.

The in-memory materialization works by reading the rows into memory 
and constructing a set of nested UnionResultSets on the fly. This 
happens in BaseActivation.java in the language execution code, in the 
method materializeResultSetIfPossible(). This code tries to ensure 
that the rows will fit in memory, but it doesn't account for the 
possibility of a stack overflow caused by the nested unions.
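
Roughly, the shape of the problem is something like the following toy 
sketch (hypothetical classes, not the real UnionResultSet code): each 
materialized row adds one more level of nesting, and opening the 
outermost node recurses through every level.

     // Toy sketch only -- not Derby's actual classes.
     public class NestedUnionSketch {
         interface RowSource { void open(); }

         static class SingleRow implements RowSource {
             public void open() { }
         }

         static class Union implements RowSource {
             final RowSource left, right;
             Union(RowSource left, RowSource right) {
                 this.left = left;
                 this.right = right;
             }
             public void open() {
                 left.open();   // recursion depth equals nesting depth
                 right.open();
             }
         }

         public static void main(String[] args) {
             // One union node per materialized row, which is effectively
             // what the subquery materialization code builds.
             RowSource chain = new SingleRow();
             for (int i = 0; i < 100000; i++)
                 chain = new Union(chain, new SingleRow());
             chain.open();      // deep recursion -> StackOverflowError
         }
     }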

The obvious solution is to get rid of the nested union technique in 
materializeResultSetIfPossible(), and to replace it with an iterative 
technique (as opposed to a recursive one). This should be pretty 
easy, as the method already puts the rows into a Vector before 
creating the unions.
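
As a rough sketch of what I mean (a hypothetical class, not an actual 
Derby ResultSet), the already-collected Vector can simply be walked 
with a loop, so the stack depth stays constant no matter how many rows 
were materialized:

     import java.util.Vector;

     // Toy sketch only: iterate over the materialized rows instead of
     // wrapping each one in another nested union.
     public class MaterializedRowsSketch<R> {
         private final Vector<R> rows;  // rows the method already collects
         private int position = -1;

         public MaterializedRowsSketch(Vector<R> rows) {
             this.rows = rows;
         }

         public void open() {
             position = -1;             // no recursion, constant stack depth
         }

         public R getNextRow() {
             position++;
             return (position < rows.size()) ? rows.get(position) : null;
         }
     }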

I should say, however, that some aspects of this optimization bother 
me. The biggest problem is that it could make some queries run much 
slower. Consider the following:

     select * from one_row_table where column1 not in
         (select column2 from million_row_table)

The "unoptimized" query will read the single row from one_row_table 
and do a probe into million_row_table to see if it finds a match. If 
there is an index on million_row_table.column2, it will return a 
result very quickly despite the large size of million_row_table.

With the materialization "optimization" it will read all of 
million_row_table into memory, even if there's an index on column2. 
This is obviously a big performance hit.

Since subquery materialization of this type can be expensive, the 
decision as to whether to do it should be based on query optimizer 
cost estimates. The optimizer knows what the cost is of materializing 
a subquery, and also what the cost is of getting a row from an 
unmaterialized subquery.
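
To illustrate the kind of comparison I have in mind (made-up method and 
numbers, not the actual OptimizerFactory interface): materialization 
only pays off when its one-time cost is lower than the total cost of 
probing the unmaterialized subquery once per outer row.

     // Hypothetical cost comparison, ignoring the (usually small) cost
     // of scanning the in-memory copy once per outer row.
     static boolean shouldMaterialize(double subqueryRowCount,
                                      double costPerMaterializedRow,
                                      double outerRowCount,
                                      double costPerProbe) {
         double materializeCost = subqueryRowCount * costPerMaterializedRow;
         double probeCost       = outerRowCount * costPerProbe;
         return materializeCost < probeCost;
     }

For the one_row_table example below, the outer row count is 1 and an 
index probe is cheap, so probing wins by a wide margin and 
materialization should be skipped.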

I also wonder about the wisdom of having the language cache partial 
query results in memory. Although this can help performance in some 
cases, it also chews up memory. This is different from a limited-size 
cache with a backing store (like what the store uses for page 
caching). The language has no way to limit the total amount of memory 
used in this type of processing.

I also have some problems with the way this optimization was 
implemented. The decision to materialize the subquery results in 
memory is made during code generation - all such decisions should be 
made during the compilation phase. It's not clear to me why 
materializeResultSetIfPossible() is in BaseActivation - I would 
expect the materialization to be done by a type of ResultSet, not 
by a method in BaseActivation. Also, this method calls 
getMaxMemoryPerTable() in the OptimizerFactory - nothing in the 
execution code should refer to anything in the compilation code 
(perhaps getMaxMemoryPerTable() should be moved somewhere else).

Thoughts, anyone?


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/ 


Re: Stack overflow from materialized subquery

Posted by Jeffrey Lichtman <sw...@rcn.com>.
>It looks like you have identified a number of serious defects here. 
>Your analysis seems solid to me and I doubt anyone is going to argue 
>with you. Could you log a bug so we can fix these issues hopefully 
>over the coming months?
>
>Thanks,
>-Rick

OK. This is now DERBY-634.


                        -        Jeff Lichtman
                                 swazoo@rcn.com
                                 Check out Swazoo Koolak's Web Jukebox at
                                 http://swazoo.com/ 


Re: Stack overflow from materialized subquery

Posted by Rick Hillegas <Ri...@Sun.COM>.
Hi Jeff,

It looks like you have identified a number of serious defects here. Your 
analysis seems solid to me and I doubt anyone is going to argue with 
you. Could you log a bug so we can fix these issues hopefully over the 
coming months?

Thanks,
-Rick


Re: Stack overflow from materialized subquery

Posted by Satheesh Bandaram <sa...@Sourcery.Org>.
Thanks Jeff, great analysis... It seems to me this was originally
intended to cache only a small number of subquery result rows, but the
code doesn't seem to implement it that way. Implementing the
"optimization" at runtime was probably chosen to make sure only a small
number of rows are returned by the subquery. Like you said, using
nested unions seems like a bad idea too.

Any ideas on how to undo the optimization without losing the
performance gain some queries might be getting now? It seems the
"optimization" was done for a specific customer query whose performance
it supposedly improved by a factor of 100...

Satheesh
