Posted to dev@geode.apache.org by Jakov Varenina <ja...@est.tech> on 2021/11/24 19:13:01 UTC

Question related to orphaned .drf files in disk-store

Hi devs,

We have noticed that the disk-store folder can contain orphaned .drf
files (a .drf file without the accompanying .crf and .krf files with the
same id). Also, we have noticed that these "orphaned" .drf files are held
in heap (the drfOnlyOplogs hash map with size 7680 in the picture below):

Could you please tell us why Geode, after compaction, sometimes keeps
only the .drf file and deletes the .crf and .krf files? Why does Geode
need those orphaned .drf files?

BRs/Jakov


Re: Question related to orphaned .drf files in disk-store

Posted by Anthony Baker <ba...@vmware.com>.
Got it, that seems entirely reasonable.

Anthony


> On Dec 3, 2021, at 7:37 AM, Jakov Varenina <ja...@est.tech> wrote:
> 
> Hi Anthony,
> 
> Not sure what is normal, but when we were investigating the issue there were 21 .crf files in the disk-store (on one server) with the default max-oplog-size (1 GB) and compaction-threshold (50%).
> 
> BRs/Jakov
> 
> On 02. 12. 2021. 17:06, Anthony Baker wrote:
>> Related but different question:  how many active oplogs do you normally see at one time?  You may want to adjust the max-oplog-size if the default of 1 GB is too small.
>> 


Re: Question related to orphaned .drf files in disk-store

Posted by Jakov Varenina <ja...@est.tech>.
Hi Anthony,

Not sure what is normal, but when we were investigating the issue there
were 21 .crf files in the disk-store (on one server) with the default
max-oplog-size (1 GB) and compaction-threshold (50%).

BRs/Jakov

On 02. 12. 2021. 17:06, Anthony Baker wrote:
> Related but different question:  how many active oplogs do you normally see at one time?  You may want to adjust the max-oplog-size if the default of 1 GB is too small.
>
> On Dec 2, 2021, at 1:11 AM, Jakov Varenina <ja...@est.tech> wrote:
>
> Hi Dan,
>
> We forgot to mention that we actually configure off-heap for the regions, so cache entry values are stored outside the heap memory. Only Oplog objects that are not compacted and that have a .crf file reference the live entries from the cache. These Oplog objects are not stored in the drfOnlyOplogs hashmap. The drfOnlyOplogs map holds only Oplog objects that represent orphaned .drf files (the ones without accompanying .crf and .krf files). These objects have been compacted and don't contain a reference to any live entry from the cache. All of this 18 GB is actually occupied by empty pendingKrfTags hashmaps.
>
> In this case there are 7680 Oplog objects stored in drfOnlyOplogs. Every Oplog object has its own regionMap hashmap. Every regionMap can contain hundreds of empty pendingKrfTags hashmaps. When you bring all that together you get more than 18 GB of unnecessary heap memory.
>
> Thank you all for the quick review of the PR and the fast response to our questions!
>
> BRs/Jakov
>
>
> On 02. 12. 2021. 00:25, Dan Smith wrote:
> Interesting. It does look like that pendingKrfTags structure is wasting memory.
>
> I think that retained heap of 20 gigabytes might include your entire cache, because those objects have references back to the Cache object. However with 6K oplogs each having an empty map with 4K elements that does add up.
>
> -Dan
> ------------------------------------------------------------------------
> *From:* Jakov Varenina <ja...@est.tech>
> *Sent:* Tuesday, November 30, 2021 5:53 AM
> *To:* dev@geode.apache.org <de...@geode.apache.org>
> *Subject:* Re: Question related to orphaned .drf files in disk-store
>
> Hi Dan and all,
>
>
> Here is an additional picture that better represents the severity of the problem with pendingKrfTags. After you check the second picture in the mail below, please come back and check this one as well. Here you can see that pendingKrfTags is empty and has a capacity of 9,192 allocated in memory.
>
>
>
> Sorry for any inconvenience.
>
> BRs/Jakov
>
>
> On 30. 11. 2021. 09:32, Jakov Varenina wrote:
>
> Hi Dan,
>
>
> Thank you for your answer!
>
>
> We have identified a memory leak in the Oplog objects that represent orphaned .drf files in heap memory. In the screenshot below you can see that 7680 drfOnlyOplogs consume more than 18 GB of heap, which doesn't seem correct.
>
>
>
> In the picture below you can see that the drfOnlyOplogs.Oplog.regionMap.pendingKrfTags structure is responsible for more than 95% of the drfOnlyOplogs heap memory.
>
>
>
>
> The pendingKrfTags structure is actually empty (although it consumes memory because it was used previously and the size of the HashMap was not reduced) and is not used by the drfOnlyOplogs objects. Additionally, the regionMap.liveEntries linked list has just one element (a fake disk entry, OplogDiskEntry, indicating that the list is empty) and it is also not used. You can find more details on why the pendingKrfTags structure remained in memory for an Oplog object representing an orphaned .drf file, and a possible solution, in the following ticket and PR:
>
>
> https://issues.apache.org/jira/browse/GEODE-9854
>
> https://github.com/apache/geode/pull/7145
>
>
> BRs/Jakov
>
>
>
> On 24. 11. 2021. 23:12, Dan Smith wrote:
> The .drf file contains destroy records for entries in any older oplog. So even if the corresponding .crf file has been deleted, the .drf file with the same number still needs to be retained until the older .crf files are all deleted.
>
> 7680 does seem like a lot of oplogs. That data structure is just references to the files themselves, I don't think we are keeping the contents of the .drf files in memory, except during recovery time.
>
> -Dan
> ------------------------------------------------------------------------
> *From:* Jakov Varenina <ja...@est.tech>
> *Sent:* Wednesday, November 24, 2021 11:13 AM
> *To:* dev@geode.apache.org <de...@geode.apache.org>
> *Subject:* Question related to orphaned .drf files in disk-store
> Hi devs,
>
> We have noticed that the disk-store folder can contain orphaned .drf files (a .drf file without the accompanying .crf and .krf files with the same id). Also, we have noticed that these "orphaned" .drf files are held in heap (the drfOnlyOplogs hash map with size 7680 in the picture below):
>
> Could you please tell us why Geode, after compaction, sometimes keeps only the .drf file and deletes the .crf and .krf files? Why does Geode need those orphaned .drf files?
>
> BRs/Jakov
>
>

Re: Question related to orphaned .drf files in disk-store

Posted by Anthony Baker <ba...@vmware.com>.
Related but different question:  how many active oplogs do you normally see at one time?  You may want to adjust the max-oplog-size if the default of 1 GB is too small.



Re: Question related to orphaned .drf files in disk-store

Posted by Jakov Varenina <ja...@est.tech>.
Hi Dan,

We forgot to mention that we actually configure off-heap for the
regions, so cache entry values are stored outside the heap memory. Only
Oplog objects that are not compacted and that have a .crf file reference
the live entries from the cache. These Oplog objects are not stored in
the drfOnlyOplogs hashmap. The drfOnlyOplogs map holds only Oplog
objects that represent orphaned .drf files (the ones without
accompanying .crf and .krf files). These objects have been compacted and
don't contain a reference to any live entry from the cache. All of this
18 GB is actually occupied by empty pendingKrfTags hashmaps.

In this case there are 7680 Oplog objects stored in drfOnlyOplogs. Every
Oplog object has its own regionMap hashmap. Every regionMap can contain
hundreds of empty pendingKrfTags hashmaps. When you bring all that
together you get more than 18 GB of unnecessary heap memory.

Thank you all for the quick review of the PR and the fast response to
our questions!
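
To make the arithmetic above concrete, here is a rough back-of-the-envelope
sketch in Java. It is only an estimate under stated assumptions: 4-byte
(compressed) references, a 16-byte array header, and a hypothetical figure
of 64 empty maps per Oplog chosen purely for illustration (the thread only
says a regionMap "can contain hundreds"). The 7680 Oplog objects and the
capacity of 9,192 are the figures reported in this thread.

```java
final class EmptyMapHeapEstimate {

    /**
     * Estimates the bytes held by the bucket arrays of HashMaps that are
     * empty but were never shrunk. All parameters are inputs to the
     * estimate, not values read from a real heap dump.
     */
    static long bucketArrayBytes(long oplogs, long mapsPerOplog,
                                 long capacity, long bytesPerReference) {
        long arrayHeaderBytes = 16; // assumption: typical 64-bit JVM array header
        long bytesPerMap = arrayHeaderBytes + capacity * bytesPerReference;
        return oplogs * mapsPerOplog * bytesPerMap;
    }

    public static void main(String[] args) {
        // 7680 Oplog objects and capacity 9,192 come from the thread;
        // 64 maps per Oplog and 4-byte references are assumptions.
        long bytes = bucketArrayBytes(7680, 64, 9192, 4);
        System.out.println(bytes + " bytes");
    }
}
```

With these assumed inputs the estimate lands close to the 18 GB seen in the
heap dump, which suggests that empty bucket arrays alone are enough to
explain the retained size.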

BRs/Jakov



Re: Question related to orphaned .drf files in disk-store

Posted by Dan Smith <da...@vmware.com>.
Interesting. It does look like that pendingKrfTags structure is wasting memory.

I think that retained heap of 20 gigabytes might include your entire cache, because those objects have references back to the Cache object. However with 6K oplogs each having an empty map with 4K elements that does add up.

-Dan

Re: Question related to orphaned .drf files in disk-store

Posted by Jakov Varenina <ja...@est.tech>.
Hi Dan and all,


Here is an additional picture that better represents the severity of the
problem with pendingKrfTags. After you check the second picture in the
mail below, please come back and check this one as well. Here you can
see that pendingKrfTags is empty and has a capacity of 9,192 allocated
in memory.



Sorry for any inconvenience.

BRs/Jakov



Re: Question related to orphaned .drf files in disk-store

Posted by Jakov Varenina <ja...@est.tech>.
Hi Dan,


Thank you for your answer!


We have identified a memory leak in the Oplog objects that represent
orphaned .drf files in heap memory. In the screenshot below you can see
that 7680 drfOnlyOplogs consume more than 18 GB of heap, which doesn't
seem correct.



In the picture below you can see that the
drfOnlyOplogs.Oplog.regionMap.pendingKrfTags structure is responsible
for more than 95% of the drfOnlyOplogs heap memory.




The pendingKrfTags structure is actually empty (although it consumes
memory because it was used previously and the size of the HashMap was
not reduced) and is not used by the drfOnlyOplogs objects. Additionally,
the regionMap.liveEntries linked list has just one element (a fake disk
entry, OplogDiskEntry, indicating that the list is empty) and it is also
not used. You can find more details on why the pendingKrfTags structure
remained in memory for an Oplog object representing an orphaned .drf
file, and a possible solution, in the following ticket and PR:


https://issues.apache.org/jira/browse/GEODE-9854

https://github.com/apache/geode/pull/7145
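
For readers unfamiliar with the HashMap behaviour mentioned above: in
OpenJDK, HashMap.clear() empties the buckets but keeps the already-grown
bucket array, so an "empty" map can still pin memory. The sketch below
illustrates one general remedy, dropping the reference once the map is no
longer needed. PendingTagsHolder and its methods are hypothetical names
used only for illustration; this is not Geode code and not necessarily what
the PR does.

```java
import java.util.HashMap;
import java.util.Map;

final class PendingTagsHolder {
    // Hypothetical stand-in for a per-region structure that holds a map
    // similar in role to pendingKrfTags; not a Geode class.
    private Map<String, Long> pendingTags = new HashMap<>();

    void add(String key, long tag) {
        if (pendingTags == null) {
            pendingTags = new HashMap<>(); // re-create lazily after release()
        }
        pendingTags.put(key, tag);
    }

    /** Empties the map, but the grown bucket array stays reachable. */
    void clearOnly() {
        if (pendingTags != null) {
            pendingTags.clear();
        }
    }

    /** Drops the map so the garbage collector can reclaim its bucket array. */
    void release() {
        pendingTags = null;
    }

    boolean holdsMap() {
        return pendingTags != null;
    }

    int size() {
        return pendingTags == null ? 0 : pendingTags.size();
    }
}
```

After clearOnly() the holder still references a map object (and whatever
capacity it grew to); after release() nothing is retained until the next
add().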


BRs/Jakov




Re: Question related to orphaned .drf files in disk-store

Posted by Dan Smith <da...@vmware.com>.
The .drf file contains destroy records for entries in any older oplog. So even if the corresponding .crf file has been deleted, the .drf file with the same number still needs to be retained until the older .crf files are all deleted.
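
As a rough illustration of the retention rule described above (not Geode's
actual implementation), the following Java sketch decides which .drf-only
oplogs could be deleted: a .drf file is kept as long as any oplog with a
smaller id still has its .crf file. OplogFiles and deletableDrfIds are
hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class DrfRetentionSketch {

    /** Hypothetical stand-in for one oplog's files on disk; not a Geode class. */
    static final class OplogFiles {
        final long id;
        boolean hasCrf;
        boolean hasDrf;

        OplogFiles(long id, boolean hasCrf, boolean hasDrf) {
            this.id = id;
            this.hasCrf = hasCrf;
            this.hasDrf = hasDrf;
        }
    }

    /**
     * Returns the ids of .drf-only oplogs that have no older oplog with a
     * .crf file, i.e. whose destroy records can no longer apply to anything.
     */
    static List<Long> deletableDrfIds(TreeMap<Long, OplogFiles> oplogs) {
        List<Long> deletable = new ArrayList<>();
        boolean olderCrfExists = false;
        for (OplogFiles oplog : oplogs.values()) { // TreeMap iterates in ascending id order
            if (oplog.hasDrf && !oplog.hasCrf && !olderCrfExists) {
                deletable.add(oplog.id);
            }
            if (oplog.hasCrf) {
                olderCrfExists = true; // every later .drf must now be retained
            }
        }
        return deletable;
    }

    public static void main(String[] args) {
        TreeMap<Long, OplogFiles> oplogs = new TreeMap<>();
        oplogs.put(1L, new OplogFiles(1L, false, true)); // orphaned .drf
        oplogs.put(2L, new OplogFiles(2L, true, true));  // still has its .crf
        oplogs.put(3L, new OplogFiles(3L, false, true)); // orphaned .drf, must be kept
        System.out.println(deletableDrfIds(oplogs)); // prints [1]
    }
}
```

In this example oplog 3 has lost its .crf file, but its .drf file must stay
because oplog 2 still has a .crf file whose entries the destroy records may
refer to.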

7680 does seem like a lot of oplogs. That data structure is just references to the files themselves, I don't think we are keeping the contents of the .drf files in memory, except during recovery time.

-Dan