Posted to issues@lucy.apache.org by "Serkan Mulayim (JIRA)" <ji...@apache.org> on 2018/02/01 00:52:00 UTC

[lucy-issues] [jira] [Created] (LUCY-326) C lib: Possible memory leak in SnowStemmer when provided schema for the indexer is not DECREFFED

Serkan Mulayim created LUCY-326:
-----------------------------------

             Summary: C lib: Possible memory leak in SnowStemmer when provided schema for the indexer is not DECREFFED
                 Key: LUCY-326
                 URL: https://issues.apache.org/jira/browse/LUCY-326
             Project: Lucy
          Issue Type: Bug
          Components: C bindings
    Affects Versions: 0.6.1
         Environment: linux
            Reporter: Serkan Mulayim


In my C library I create a static global struct, holding some runtime variables and a lucy_Schema pointer, when the program is loaded. A destroy function cleans up this runtime data and also DECREFs the schema. When I index some documents by passing this schema to the indexer and call the destroy function before the program using the library exits, valgrind reports no memory leaks. Only "still reachable" is non-zero, because of the lucy_bootstrap_parcel function.

On the other hand, if I do not call the destroy function before exit, I would expect only an increase in the "still reachable" blocks in the valgrind output, but I also see "possibly lost", as follows:

---------------------------------------------------------------------------------------------------

==16942== 70 bytes in 1 blocks are possibly lost in loss record 147 of 178
==16942== at 0x4C29B78: realloc (vg_replace_malloc.c:785)
==16942== by 0x4F86CC4: increase_size (utilities.c:332)
==16942== by 0x4F87865: replace_s (utilities.c:360)
==16942== by 0x4EF4195: SN_set_current (api.c:62)
==16942== by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
==16942== by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
==16942== by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
==16942== by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
==16942== by 0x4F15368: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
==16942== by 0x4F15368: LUCY_Inverter_Add_Field_IMP (Inverter.c:181)
==16942== by 0x4F14E91: LUCY_Inverter_Add_Field (Inverter.h:296)
==16942== by 0x4F14E91: LUCY_Inverter_Invert_Doc_IMP (Inverter.c:109)
==16942== by 0x4F63164: LUCY_Inverter_Invert_Doc (Inverter.h:275)
==16942== by 0x4F63164: LUCY_SegWriter_Add_Doc_IMP (SegWriter.c:109)
==16942== by 0x4F7E069: LUCY_Indexer_Add_Doc (Indexer.h:260)
==16942== by 0x4F7F23F: _symSE_index_messages_json (SymphonySearch.c:432)
==16942==
==16942== LEAK SUMMARY:
==16942== definitely lost: 0 bytes in 0 blocks
==16942== indirectly lost: 0 bytes in 0 blocks
==16942== possibly lost: 70 bytes in 1 blocks
==16942== still reachable: 246,683 bytes in 5,077 blocks
==16942== suppressed: 0 bytes in 0 blocks

---------------------------------------------------------------------------------------------------

Similarly, in another program that only searches (no indexing), I see the same behaviour. The valgrind output for that one is below:

-----------------------------------------------------------------------------------------------------

==16949==
==16949== HEAP SUMMARY:
==16949== in use at exit: 229,312 bytes in 5,061 blocks
==16949== total heap usage: 34,993 allocs, 29,932 frees, 1,791,083 bytes allocated
==16949==
==16949== 37 bytes in 1 blocks are possibly lost in loss record 96 of 177
==16949== at 0x4C29B78: realloc (vg_replace_malloc.c:785)
==16949== by 0x4F86CC4: increase_size (utilities.c:332)
==16949== by 0x4F87865: replace_s (utilities.c:360)
==16949== by 0x4EF4195: SN_set_current (api.c:62)
==16949== by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
==16949== by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
==16949== by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
==16949== by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
==16949== by 0x4EF35F3: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
==16949== by 0x4EF35F3: LUCY_Analyzer_Split_IMP (Analyzer.c:48)
==16949== by 0x4F5AAC8: LUCY_Analyzer_Split (Analyzer.h:211)
==16949== by 0x4F5AAC8: LUCY_QParser_Expand_Leaf_IMP (QueryParser.c:916)
==16949== by 0x4F59ECA: LUCY_QParser_Expand (QueryParser.h:298)
==16949== by 0x4F59ECA: LUCY_QParser_Parse_IMP (QueryParser.c:207)
==16949== by 0x4F7E358: LUCY_QParser_Parse (QueryParser.h:284)
==16949== by 0x4F7F492: _symSE_get_query (SymphonySearch.c:483)
==16949==
==16949== LEAK SUMMARY:
==16949== definitely lost: 0 bytes in 0 blocks
==16949== indirectly lost: 0 bytes in 0 blocks
==16949== possibly lost: 37 bytes in 1 blocks
==16949== still reachable: 229,275 bytes in 5,060 blocks
==16949== suppressed: 0 bytes in 0 blocks
==16949== Reachable blocks (those to which a pointer was found) are not shown.
==16949== To see them, rerun with: --leak-check=full --show-leak-kinds=all

----------------------------------------------------------------------------------------------------

*If I remove the SnowStemmer from the Analyzers, this issue does not occur (only "still reachable" is non-zero).*
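
A possible explanation, stated as an assumption rather than something verified against this build: valgrind classifies a block as "possibly lost" when the only pointer it can find points into the middle of the block instead of at its start. Both traces end in realloc called from increase_size in the Snowball utilities.c, and as far as I know that code keeps a small header in front of the symbol buffer and hands out a pointer just past it. The sketch below reproduces that layout in isolation. All names in it (HEAD, CAPACITY, SIZE, buf_create, buf_free, g_buf, demo_setup, demo_teardown) are hypothetical and are not part of the Lucy API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout: two ints (capacity, size) stored in front of the payload. */
#define HEAD        (2 * sizeof(int))
#define CAPACITY(p) (((int *)(p))[-2])
#define SIZE(p)     (((int *)(p))[-1])

/* Allocate a block and return a pointer to the payload, HEAD bytes past
 * the address returned by malloc. The block start itself is never stored. */
unsigned char *buf_create(int capacity) {
    assert(capacity >= 0);
    void *mem = malloc(HEAD + (size_t)capacity);
    if (mem == NULL) return NULL;
    unsigned char *p = (unsigned char *)mem + HEAD;
    CAPACITY(p) = capacity;
    SIZE(p) = 0;
    return p;
}

/* Step back over the header to recover the address malloc returned. */
void buf_free(unsigned char *p) {
    if (p != NULL) free(p - HEAD);
}

/* The only long-lived reference to the block is an interior pointer. */
unsigned char *g_buf = NULL;

void demo_setup(void)    { g_buf = buf_create(64); }
void demo_teardown(void) { buf_free(g_buf); g_buf = NULL; }
```

If a program calls demo_setup() and exits without demo_teardown(), I would expect valgrind --leak-check=full to list this block under "possibly lost" rather than "still reachable", because g_buf does not point at the start of the allocation. If the same holds for the stemmer's buffer, the report above would be a consequence of the missing DECREF and not a separate leak inside SnowStemmer.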


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)