Posted to dev@calcite.apache.org by Shuyi Chen <su...@gmail.com> on 2016/10/03 07:25:51 UTC

Re: Non heap memory leak in Calcite?

I think I have a theory for the non-heap memory leak I observed.
Calcite uses code generation when working with adapters: for each query
issued, code is generated at runtime, then compiled and loaded.
However, JIT compiler data and metadata for loaded classes are stored
in the JVM's non-heap area, which is rarely garbage-collected unless the
classes are explicitly unloaded. As a result, non-heap memory grows steadily
as queries come in over time. I've verified the leak using jstat on both the
Cassandra schema and the reflective schema.
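
To see which part of the non-heap area is growing, the JVM's non-heap
memory pools can also be listed from inside the process. What follows is a
minimal sketch that uses only the standard java.lang.management API. The
names NonHeapPools and usedBytesByPool are made up for illustration, and the
pool names reported (for example a metaspace pool or code cache pools)
depend on the JVM implementation and version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reports how much memory each non-heap pool uses. */
public class NonHeapPools {

    /** Returns used bytes per non-heap memory pool, keyed by pool name. */
    public static Map<String, Long> usedBytesByPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // skip heap pools such as eden or old gen
            }
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                used.put(pool.getName(), usage.getUsed());
            }
        }
        return used;
    }

    public static void main(String[] args) {
        // Print one line per pool; call this before and after a batch of queries.
        usedBytesByPool().forEach((name, bytes) ->
                System.out.printf("%-36s %,d bytes%n", name, bytes));
    }
}
```

Comparing two readings taken before and after a batch of queries should
show whether the growth is in class metadata, in compiled code, or in both.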

Could you please let me know if I am correct and how we can fix the leak?
Thanks a lot.

Shuyi

On Thu, Aug 25, 2016 at 4:02 PM, Julian Hyde <jh...@apache.org> wrote:

> Calcite doesn’t use off-heap memory (no native code, no direct
> ByteBuffers, no Unsafe). So if there are off-heap leaks, look to Calcite’s
> libraries (e.g. the Cassandra driver) or to Calcite adapters’ use of those
> libraries.
>
> Julian
>
>
> > On Aug 25, 2016, at 3:45 PM, Shuyi Chen <su...@gmail.com> wrote:
> >
> > Hi all, I was using Calcite to query data in Cassandra. I found that
> there
> > is memory leak in non-heap space when doing calcite queries. Has someone
> > spot such issues? Below is an example code snippet I used. Thanks a lot.
> >
> > Statement statement = someCalciteConnection.createStatement();
> >
> > ResultSet resultSet = statement.executeQuery(query);
> > final JSONArray jsonArray = JSONUtil.convertResultSetIntoJSON(
> resultSet);
> > statement.close();
> >
> >
> > Shuyi
> >
> > --
> > "So you have to trust that the dots will somehow connect in your future."
>
>


-- 
"So you have to trust that the dots will somehow connect in your future."

Re: Non heap memory leak in Calcite?

Posted by Julian Hyde <jh...@apache.org>.
Can you please log a JIRA case for this? If you can create a PR with a test case, even better.

> On Oct 6, 2016, at 11:14 AM, Shuyi Chen <su...@gmail.com> wrote:
> 
> Hi Julian,
> 
> 
> This is how I use calcite and get non-heap memory leak. Could you please
> take a look if I missed something?
> 
> 
> Statement statement =
> cassandraConnectionFactory.getCalciteConnection().createStatement();
> ResultSet resultSet = statement.executeQuery(query);
> final JSONArray jsonArray = JSONUtil.convertResultSetIntoJSON(resultSet);
> statement.close();
> 
> 
> Also, to reproduce the problem, you can follow this post (
> https://martinsdeveloperworld.wordpress.com/2015/08/24/apache-calcite-setting-up-your-own-in-memory-database-with-sql-interface/)
> to reproduce the problem easily on ReflectiveSchema by increasing the
> iteration count, and use jstat to observe.
> 
> 
> Thanks a lot.
> 
> Shuyi
> 
> On Mon, Oct 3, 2016 at 1:17 PM, Julian Hyde <jh...@apache.org> wrote:
> 
>> I don’t know how the JVM stores JIT information but the most logical place
>> to store it would be alongside the loaded class. If that is the case, if we
>> are correctly using class loaders and are discarding classes when we no
>> longer need them — and I believe we are — then we shouldn’t have a resource
>> leak.
>> 
>> Vladimir, As our resident expert on all things JVM, would you like to
>> comment?
>> 
>> Shuyi, Is it possible that you have a statement leak? I.e. you hold
>> statements open after you have finished with them. If so, Calcite would be
>> unable to free up generated classes and their associated data in the JVM,
>> and I suppose that might give the symptoms you are seeing.
>> 
>> Julian
>> 
>>> On Oct 3, 2016, at 12:25 AM, Shuyi Chen <su...@gmail.com> wrote:
>>> 
>>> I think I might have a theory for the non-heap memory leak I observed.
>>> Calcite uses code generation when working with adapters. For each issued
>>> query, code will be generated at runtime and get compiled and loaded.
>>> However, the JIT compiler data and runtime class loaded info are all
>> stored
>>> in JVM's non-heap area which rarely get gc unless they get unloaded
>>> explicitly. Therefore, it causes non-heap memory explosion as queries
>> come
>>> in over time. I've verified the leak using jstat on both cassandra schema
>>> and reflective schema.
>>> 
>>> Could you please let me know if I am correct and how we can fix the leak?
>>> Thanks a lot.
>>> 
>>> Shuyi
>>> 
>>> On Thu, Aug 25, 2016 at 4:02 PM, Julian Hyde <jh...@apache.org> wrote:
>>> 
>>>> Calcite doesn’t use off-heap memory (no native code, no direct
>>>> ByteBuffers, no Unsafe). So if there are off-heap leaks, look to
>> Calcite’s
>>>> libraries (e.g. the Cassandra driver) or to Calcite adapters’ use of
>> those
>>>> libraries.
>>>> 
>>>> Julian
>>>> 
>>>> 
>>>>> On Aug 25, 2016, at 3:45 PM, Shuyi Chen <su...@gmail.com> wrote:
>>>>> 
>>>>> Hi all, I was using Calcite to query data in Cassandra. I found that
>>>> there
>>>>> is memory leak in non-heap space when doing calcite queries. Has
>> someone
>>>>> spot such issues? Below is an example code snippet I used. Thanks a
>> lot.
>>>>> 
>>>>> Statement statement = someCalciteConnection.createStatement();
>>>>> 
>>>>> ResultSet resultSet = statement.executeQuery(query);
>>>>> final JSONArray jsonArray = JSONUtil.convertResultSetIntoJSON(
>>>> resultSet);
>>>>> statement.close();
>>>>> 
>>>>> 
>>>>> Shuyi
>>>>> 
>>>>> --
>>>>> "So you have to trust that the dots will somehow connect in your
>> future."
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> "So you have to trust that the dots will somehow connect in your future."
>> 
>> 
> 
> 
> -- 
> "So you have to trust that the dots will somehow connect in your future."


Re: Non heap memory leak in Calcite?

Posted by Shuyi Chen <su...@gmail.com>.
Hi Julian,


This is how I use Calcite and get the non-heap memory leak. Could you please
take a look and see if I missed something?


Statement statement =
    cassandraConnectionFactory.getCalciteConnection().createStatement();
ResultSet resultSet = statement.executeQuery(query);
final JSONArray jsonArray = JSONUtil.convertResultSetIntoJSON(resultSet);
statement.close();


Also, you can reproduce the problem easily on ReflectiveSchema by following
this post (
https://martinsdeveloperworld.wordpress.com/2015/08/24/apache-calcite-setting-up-your-own-in-memory-database-with-sql-interface/)
and increasing the iteration count, then using jstat to observe.
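
As an in-process complement to jstat, the iteration loop can report the
JVM's class-loading counters as it runs. This is only a sketch built on the
standard ClassLoadingMXBean. ClassCountWatcher and runAndMeasure are
hypothetical names, and the empty workload in main is a placeholder for code
that executes one query.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical harness: repeats a workload and tracks loaded-class counts. */
public class ClassCountWatcher {

    /**
     * Runs the workload the given number of times and returns the net change
     * in the number of currently loaded classes. If reportEvery is positive,
     * prints the counters every reportEvery iterations.
     */
    public static int runAndMeasure(Runnable workload, int iterations, int reportEvery) {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        int loadedAtStart = classes.getLoadedClassCount();
        for (int i = 1; i <= iterations; i++) {
            workload.run();
            if (reportEvery > 0 && i % reportEvery == 0) {
                System.out.printf("iteration %d: loaded=%d unloaded=%d%n",
                        i, classes.getLoadedClassCount(), classes.getUnloadedClassCount());
            }
        }
        return classes.getLoadedClassCount() - loadedAtStart;
    }

    public static void main(String[] args) {
        // Placeholder workload: replace the empty lambda with one query execution.
        int delta = runAndMeasure(() -> { }, 1000, 100);
        System.out.println("net change in loaded classes: " + delta);
    }
}
```

A loaded count that keeps climbing while the unloaded count stays flat would
be consistent with the leak described here; a count that levels off would
not.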


Thanks a lot.

Shuyi

On Mon, Oct 3, 2016 at 1:17 PM, Julian Hyde <jh...@apache.org> wrote:

> I don’t know how the JVM stores JIT information but the most logical place
> to store it would be alongside the loaded class. If that is the case, if we
> are correctly using class loaders and are discarding classes when we no
> longer need them — and I believe we are — then we shouldn’t have a resource
> leak.
>
> Vladimir, As our resident expert on all things JVM, would you like to
> comment?
>
> Shuyi, Is it possible that you have a statement leak? I.e. you hold
> statements open after you have finished with them. If so, Calcite would be
> unable to free up generated classes and their associated data in the JVM,
> and I suppose that might give the symptoms you are seeing.
>
> Julian
>
> > On Oct 3, 2016, at 12:25 AM, Shuyi Chen <su...@gmail.com> wrote:
> >
> > I think I might have a theory for the non-heap memory leak I observed.
> > Calcite uses code generation when working with adapters. For each issued
> > query, code will be generated at runtime and get compiled and loaded.
> > However, the JIT compiler data and runtime class loaded info are all
> stored
> > in JVM's non-heap area which rarely get gc unless they get unloaded
> > explicitly. Therefore, it causes non-heap memory explosion as queries
> come
> > in over time. I've verified the leak using jstat on both cassandra schema
> > and reflective schema.
> >
> > Could you please let me know if I am correct and how we can fix the leak?
> > Thanks a lot.
> >
> > Shuyi
> >
> > On Thu, Aug 25, 2016 at 4:02 PM, Julian Hyde <jh...@apache.org> wrote:
> >
> >> Calcite doesn’t use off-heap memory (no native code, no direct
> >> ByteBuffers, no Unsafe). So if there are off-heap leaks, look to
> Calcite’s
> >> libraries (e.g. the Cassandra driver) or to Calcite adapters’ use of
> those
> >> libraries.
> >>
> >> Julian
> >>
> >>
> >>> On Aug 25, 2016, at 3:45 PM, Shuyi Chen <su...@gmail.com> wrote:
> >>>
> >>> Hi all, I was using Calcite to query data in Cassandra. I found that
> >> there
> >>> is memory leak in non-heap space when doing calcite queries. Has
> someone
> >>> spot such issues? Below is an example code snippet I used. Thanks a
> lot.
> >>>
> >>> Statement statement = someCalciteConnection.createStatement();
> >>>
> >>> ResultSet resultSet = statement.executeQuery(query);
> >>> final JSONArray jsonArray = JSONUtil.convertResultSetIntoJSON(
> >> resultSet);
> >>> statement.close();
> >>>
> >>>
> >>> Shuyi
> >>>
> >>> --
> >>> "So you have to trust that the dots will somehow connect in your
> future."
> >>
> >>
> >
> >
> > --
> > "So you have to trust that the dots will somehow connect in your future."
>
>


-- 
"So you have to trust that the dots will somehow connect in your future."

Re: Non heap memory leak in Calcite?

Posted by Julian Hyde <jh...@apache.org>.
I don’t know how the JVM stores JIT information, but the most logical place to store it would be alongside the loaded class. If that is the case, and we are correctly using class loaders and discarding classes when we no longer need them (and I believe we are), then we shouldn’t have a resource leak.

Vladimir, as our resident expert on all things JVM, would you like to comment?

Shuyi, is it possible that you have a statement leak? That is, are you holding statements open after you have finished with them? If so, Calcite would be unable to free up generated classes and their associated data in the JVM, and I suppose that might give the symptoms you are seeing.
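
One way to rule out a statement leak is try-with-resources, which closes the ResultSet and the Statement even when an exception is thrown partway through. The following is a minimal sketch using only the java.sql interfaces. ClosingQuery and countRows are hypothetical names chosen for illustration, and whether closing the statement is enough for generated classes to be released is the open question in this thread, not something the sketch establishes.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper: runs a query without leaving a statement open. */
public class ClosingQuery {

    /**
     * Executes the query and counts its rows. The ResultSet is closed first,
     * then the Statement, whether the body completes normally or throws.
     * The Connection is owned by the caller and is left open.
     */
    public static int countRows(Connection connection, String sql) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(sql)) {
            int rows = 0;
            while (resultSet.next()) {
                rows++;
            }
            return rows;
        }
    }
}
```

By contrast, code that calls statement.close() as a plain last line skips the close whenever executeQuery or the result processing throws.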

Julian

> On Oct 3, 2016, at 12:25 AM, Shuyi Chen <su...@gmail.com> wrote:
> 
> I think I might have a theory for the non-heap memory leak I observed.
> Calcite uses code generation when working with adapters. For each issued
> query, code will be generated at runtime and get compiled and loaded.
> However, the JIT compiler data and runtime class loaded info are all stored
> in JVM's non-heap area which rarely get gc unless they get unloaded
> explicitly. Therefore, it causes non-heap memory explosion as queries come
> in over time. I've verified the leak using jstat on both cassandra schema
> and reflective schema.
> 
> Could you please let me know if I am correct and how we can fix the leak?
> Thanks a lot.
> 
> Shuyi
> 
> On Thu, Aug 25, 2016 at 4:02 PM, Julian Hyde <jh...@apache.org> wrote:
> 
>> Calcite doesn’t use off-heap memory (no native code, no direct
>> ByteBuffers, no Unsafe). So if there are off-heap leaks, look to Calcite’s
>> libraries (e.g. the Cassandra driver) or to Calcite adapters’ use of those
>> libraries.
>> 
>> Julian
>> 
>> 
>>> On Aug 25, 2016, at 3:45 PM, Shuyi Chen <su...@gmail.com> wrote:
>>> 
>>> Hi all, I was using Calcite to query data in Cassandra. I found that
>> there
>>> is memory leak in non-heap space when doing calcite queries. Has someone
>>> spot such issues? Below is an example code snippet I used. Thanks a lot.
>>> 
>>> Statement statement = someCalciteConnection.createStatement();
>>> 
>>> ResultSet resultSet = statement.executeQuery(query);
>>> final JSONArray jsonArray = JSONUtil.convertResultSetIntoJSON(
>> resultSet);
>>> statement.close();
>>> 
>>> 
>>> Shuyi
>>> 
>>> --
>>> "So you have to trust that the dots will somehow connect in your future."
>> 
>> 
> 
> 
> -- 
> "So you have to trust that the dots will somehow connect in your future."