You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by satoshi hikida <sa...@gmail.com> on 2016/03/16 08:30:46 UTC

What does FileCacheService's log message (invalidating cache) mean?

Hi,

I have been working on some experiments for Cassandra and found some log
messages as follows in debug.log.
I am not sure what it exactly is, so I would appreciate if someone gives me
some explanations about it.

In my verification, a Cassandra node runs as a stand-alone server on Amazon
EC2 instance(t2.medium). And I insert 1 Billion records (about 100GB data
size) to a table from a client application (which runs on another instance
separated from Cassandra node). After insertion, Cassandra continues it's
I/O activities for (probably) compaction and keep logging the messages as
follows:

---
...
DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
FileCacheService.java:102 - Evicting cold readers for
/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
FileCacheService.java:177 - Invalidating cache for
/var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
FileCacheService.java:177 - Invalidating cache for
/var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
FileCacheService.java:177 - Invalidating cache for
/var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
FileCacheService.java:177 - Invalidating cache for
/var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
...
---

I guess these messages are related to the compaction process and
FileCacheService was invalidating cache which associated with a SSTable
file. But I'm not sure what it does actually mean. When the cache is
invalidated? And What happens is after cache invalidation?


Regards,
Satoshi

Re: What does FileCacheService's log message (invalidating cache) mean?

Posted by Stefania Alborghetti <st...@datastax.com>.
>
> Does this mean RAR(s) must be created and added to the cache
> (FileCacheService) whenever a SSTable is opened even if in case of
> compaction? I think random access read doesn't need to read the data from a
> SSTable in case of compaction because the SSTable had sorted by their keys,
> then only need sequential read to merge.



RandomAccessReader is the name of the class (an implementation of
FileDataInput), but we use them for sequential scanning as well. We don't
necessarily put them in the cache but most of the times we do. On
compaction, if there is a rate limiter on a reader (compaction throughput >
0) then we don't store them in the cache. Disk access mode mmap (the
default on 64 bit machines) is also treated differently, as we cache the
memory mapped segments, not the readers, in 2.2.


On Thu, Mar 17, 2016 at 8:40 PM, Satoshi Hikida <sa...@gmail.com> wrote:

> Hi, Stefania
>
> Thank you for your advice, again!!
>
> Could I ask you for another question?
>
> > Each sstable has one or more random access readers (one per segment for
> example) and FileCacheService is a cache for such readers
> Does this mean RAR(s) must be created and added to the cache
> (FileCacheService) whenever a SSTable is opened even if in case of
> compaction? I think random access read doesn't need to read the data from a
> SSTable in case of compaction because the SSTable had sorted by their keys,
> then only need sequential read to merge.
>
> If I have something wrong, I'm glad if you could correct.
>
>
> Regards,
> Satoshi
>
>
> On Thu, Mar 17, 2016 at 5:19 PM, Stefania Alborghetti <
> stefania.alborghetti@datastax.com> wrote:
>
>> Q1. Readers are created as needed, there is no fixed number. For example,
>> we may have 2 threads scanning sstables at the same time due to 2 different
>> CQL SELECT statements.
>>
>> Q2. There is no correlation between sstable size and JVM HEAP size. We
>> don't load entire sstables in memory.
>>
>> Q3. It's difficult to say what caused the invalidation messages,
>> basically anything that removed sstables from memory, such as dropping the
>> table, snapshots, compactions, streaming, there may me other operations I'm
>> not familiar with.
>>
>> Q4. Correct, these are temporary files. Once again, in 3.0 things are
>> different and the temporary files have been replaced by transaction logs
>> (CASSANDRA-7066).
>>
>>
>> On Thu, Mar 17, 2016 at 3:40 PM, Satoshi Hikida <sa...@imagine-orb.com>
>> wrote:
>>
>>> Sorry there is a mistake in my previous post. I would correct it.
>>>
>>> In Q3, I mentioned there are a lot of invalidating messages in the
>>> debug.log. It is true but cassandra configurations were wrong. In that
>>> case, the cassandra.yaml configurations are as follows:
>>>
>>> - cassandra.yaml
>>> - compaction_throughput_mb_per_sec: 0 (not 8 or default)
>>> - concurrent_compactors: 1
>>> - sstable_preemptive_open_interval_in_mb: 0  (not 8 or default)
>>> - memtable_flush_writers: 1
>>>
>>> And More precisely, in that case, Cassandra keep on outputting
>>> invalidating messages for a while(a few hours). However CPU usage is almost
>>> 0.0% in top command like below.
>>>
>>>     $ top -bu cassandra -n 1
>>>     ...
>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+
>>> COMMAND
>>>     2631 cassand+  20   0  0.250t 1.969g 703916 S   0.0 57.0   8459:35
>>> java
>>>
>>> I want to know what was actually happening at that time.
>>>
>>>
>>> Regards,
>>> Satoshi
>>>
>>>
>>> On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sa...@gmail.com>
>>> wrote:
>>>
>>>> Thank you for your very useful advice!
>>>>
>>>>
>>>> Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've
>>>> understood what does these logs mean. But I have more a few questions. So I
>>>> would very much appreciate If I get some explanations about these questions.
>>>>
>>>> * Q1.
>>>> In my understand, when open a SSTable, a lot of
>>>> RandomAccessReaders(RARs) are created. A number of RARs is equal to a
>>>> number of segments of SSTable. Is a number of segments(=RARs) equal to
>>>> follows?
>>>>
>>>> a number of segments = size of SSTable / size of segments
>>>>
>>>> * Q2.
>>>> What is happen if the Cassandra open a SSTable file which bigger than
>>>> JVM heap (or memory)?
>>>>
>>>> * Q3.
>>>> In my case, there are a lot of invalidating messages for the same
>>>> SSTable file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in
>>>> my previous post). In some cases, there are more than 600 invalidating
>>>> messages for the same file and these messages logged for a few hours. Would
>>>> that closing a big SSTable is the cause?
>>>>
>>>> * Q4.
>>>> I saw "tmplink-xxx" or "tmp-xxx" files in the logs and also data
>>>> directories. Are these files temporary in compaction process?
>>>>
>>>>
>>>> Here is my experimental configurations.
>>>>
>>>> - Cassandra node: An aws EC2 instance(t2.medium. 4GBRAM, 2vCPU)
>>>> - Cassandra version: 2.2.5
>>>> - inserted data size: about 100GB
>>>> - cassandra-env.sh: default
>>>> - cassandra.yaml
>>>> - compaction_throughput_mb_per_sec: 8 (or default)
>>>> - concurrent_compactors: 1
>>>> - sstable_preemptive_open_interval_in_mb: 25 (or default)
>>>> - memtable_flush_writers: 1
>>>>
>>>>
>>>> Regards,
>>>> Satoshi
>>>>
>>>>
>>>> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
>>>> stefania.alborghetti@datastax.com> wrote:
>>>>
>>>>> Each sstable has one or more random access readers (one per segment
>>>>> for example) and FileCacheService is a cache for such readers. When an
>>>>> sstable is closed, the cache is invalidated. If no single reader of an
>>>>> sstable is used for at least 512 milliseconds, all readers are evicted. If
>>>>> the sstable is opened again, new reader(s) will be created and added to the
>>>>> cache again.
>>>>>
>>>>> FileCacheService was removed in cassandra 3.0 in favour of a pool of
>>>>> page-aligned buffers, and sharing the NIO file channels amongst the readers
>>>>> of an sstable, refer to CASSANDRA-8897
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and
>>>>> CASSANDRA-8893 <https://issues.apache.org/jira/browse/CASSANDRA-8893>
>>>>> for more details.
>>>>>
>>>>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sa...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have been working on some experiments for Cassandra and found some
>>>>>> log messages as follows in debug.log.
>>>>>> I am not sure what it exactly is, so I would appreciate if someone
>>>>>> gives me some explanations about it.
>>>>>>
>>>>>> In my verification, a Cassandra node runs as a stand-alone server on
>>>>>> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about 100GB
>>>>>> data size) to a table from a client application (which runs on another
>>>>>> instance separated from Cassandra node). After insertion, Cassandra
>>>>>> continues it's I/O activities for (probably) compaction and keep logging
>>>>>> the messages as follows:
>>>>>>
>>>>>> ---
>>>>>> ...
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
>>>>>> FileCacheService.java:102 - Evicting cold readers for
>>>>>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> ...
>>>>>> ---
>>>>>>
>>>>>> I guess these messages are related to the compaction process and
>>>>>> FileCacheService was invalidating cache which associated with a SSTable
>>>>>> file. But I'm not sure what it does actually mean. When the cache is
>>>>>> invalidated? And What happens is after cache invalidation?
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Satoshi
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> [image: datastax_logo.png] <http://www.datastax.com/>
>>>>>
>>>>> Stefania Alborghetti
>>>>>
>>>>> Apache Cassandra Software Engineer
>>>>>
>>>>> |+852 6114 9265| stefania.alborghetti@datastax.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>>
>>
>> [image: datastax_logo.png] <http://www.datastax.com/>
>>
>> Stefania Alborghetti
>>
>> Apache Cassandra Software Engineer
>>
>> |+852 6114 9265| stefania.alborghetti@datastax.com
>>
>>
>>
>>
>


-- 


[image: datastax_logo.png] <http://www.datastax.com/>

Stefania Alborghetti

Apache Cassandra Software Engineer

|+852 6114 9265| stefania.alborghetti@datastax.com

Re: What does FileCacheService's log message (invalidating cache) mean?

Posted by Satoshi Hikida <sa...@gmail.com>.
Hi, Stefania

Thank you for your advice, again!!

Could I ask you for another question?

> Each sstable has one or more random access readers (one per segment for
example) and FileCacheService is a cache for such readers
Does this mean RAR(s) must be created and added to the cache
(FileCacheService) whenever a SSTable is opened even if in case of
compaction? I think random access read doesn't need to read the data from a
SSTable in case of compaction because the SSTable had sorted by their keys,
then only need sequential read to merge.

If I have something wrong, I'm glad if you could correct.


Regards,
Satoshi


On Thu, Mar 17, 2016 at 5:19 PM, Stefania Alborghetti <
stefania.alborghetti@datastax.com> wrote:

> Q1. Readers are created as needed, there is no fixed number. For example,
> we may have 2 threads scanning sstables at the same time due to 2 different
> CQL SELECT statements.
>
> Q2. There is no correlation between sstable size and JVM HEAP size. We
> don't load entire sstables in memory.
>
> Q3. It's difficult to say what caused the invalidation messages, basically
> anything that removed sstables from memory, such as dropping the table,
> snapshots, compactions, streaming, there may me other operations I'm not
> familiar with.
>
> Q4. Correct, these are temporary files. Once again, in 3.0 things are
> different and the temporary files have been replaced by transaction logs
> (CASSANDRA-7066).
>
>
> On Thu, Mar 17, 2016 at 3:40 PM, Satoshi Hikida <sa...@imagine-orb.com>
> wrote:
>
>> Sorry there is a mistake in my previous post. I would correct it.
>>
>> In Q3, I mentioned there are a lot of invalidating messages in the
>> debug.log. It is true but cassandra configurations were wrong. In that
>> case, the cassandra.yaml configurations are as follows:
>>
>> - cassandra.yaml
>> - compaction_throughput_mb_per_sec: 0 (not 8 or default)
>> - concurrent_compactors: 1
>> - sstable_preemptive_open_interval_in_mb: 0  (not 8 or default)
>> - memtable_flush_writers: 1
>>
>> And More precisely, in that case, Cassandra keep on outputting
>> invalidating messages for a while(a few hours). However CPU usage is almost
>> 0.0% in top command like below.
>>
>>     $ top -bu cassandra -n 1
>>     ...
>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+
>> COMMAND
>>     2631 cassand+  20   0  0.250t 1.969g 703916 S   0.0 57.0   8459:35
>> java
>>
>> I want to know what was actually happening at that time.
>>
>>
>> Regards,
>> Satoshi
>>
>>
>> On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sa...@gmail.com>
>> wrote:
>>
>>> Thank you for your very useful advice!
>>>
>>>
>>> Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've
>>> understood what does these logs mean. But I have more a few questions. So I
>>> would very much appreciate If I get some explanations about these questions.
>>>
>>> * Q1.
>>> In my understand, when open a SSTable, a lot of
>>> RandomAccessReaders(RARs) are created. A number of RARs is equal to a
>>> number of segments of SSTable. Is a number of segments(=RARs) equal to
>>> follows?
>>>
>>> a number of segments = size of SSTable / size of segments
>>>
>>> * Q2.
>>> What is happen if the Cassandra open a SSTable file which bigger than
>>> JVM heap (or memory)?
>>>
>>> * Q3.
>>> In my case, there are a lot of invalidating messages for the same
>>> SSTable file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in
>>> my previous post). In some cases, there are more than 600 invalidating
>>> messages for the same file and these messages logged for a few hours. Would
>>> that closing a big SSTable is the cause?
>>>
>>> * Q4.
>>> I saw "tmplink-xxx" or "tmp-xxx" files in the logs and also data
>>> directories. Are these files temporary in compaction process?
>>>
>>>
>>> Here is my experimental configurations.
>>>
>>> - Cassandra node: An aws EC2 instance(t2.medium. 4GBRAM, 2vCPU)
>>> - Cassandra version: 2.2.5
>>> - inserted data size: about 100GB
>>> - cassandra-env.sh: default
>>> - cassandra.yaml
>>> - compaction_throughput_mb_per_sec: 8 (or default)
>>> - concurrent_compactors: 1
>>> - sstable_preemptive_open_interval_in_mb: 25 (or default)
>>> - memtable_flush_writers: 1
>>>
>>>
>>> Regards,
>>> Satoshi
>>>
>>>
>>> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
>>> stefania.alborghetti@datastax.com> wrote:
>>>
>>>> Each sstable has one or more random access readers (one per segment for
>>>> example) and FileCacheService is a cache for such readers. When an sstable
>>>> is closed, the cache is invalidated. If no single reader of an sstable is
>>>> used for at least 512 milliseconds, all readers are evicted. If the sstable
>>>> is opened again, new reader(s) will be created and added to the cache again.
>>>>
>>>> FileCacheService was removed in cassandra 3.0 in favour of a pool of
>>>> page-aligned buffers, and sharing the NIO file channels amongst the readers
>>>> of an sstable, refer to CASSANDRA-8897
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and
>>>> CASSANDRA-8893 <https://issues.apache.org/jira/browse/CASSANDRA-8893>
>>>> for more details.
>>>>
>>>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have been working on some experiments for Cassandra and found some
>>>>> log messages as follows in debug.log.
>>>>> I am not sure what it exactly is, so I would appreciate if someone
>>>>> gives me some explanations about it.
>>>>>
>>>>> In my verification, a Cassandra node runs as a stand-alone server on
>>>>> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about 100GB
>>>>> data size) to a table from a client application (which runs on another
>>>>> instance separated from Cassandra node). After insertion, Cassandra
>>>>> continues it's I/O activities for (probably) compaction and keep logging
>>>>> the messages as follows:
>>>>>
>>>>> ---
>>>>> ...
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
>>>>> FileCacheService.java:102 - Evicting cold readers for
>>>>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>> ...
>>>>> ---
>>>>>
>>>>> I guess these messages are related to the compaction process and
>>>>> FileCacheService was invalidating cache which associated with a SSTable
>>>>> file. But I'm not sure what it does actually mean. When the cache is
>>>>> invalidated? And What happens is after cache invalidation?
>>>>>
>>>>>
>>>>> Regards,
>>>>> Satoshi
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> [image: datastax_logo.png] <http://www.datastax.com/>
>>>>
>>>> Stefania Alborghetti
>>>>
>>>> Apache Cassandra Software Engineer
>>>>
>>>> |+852 6114 9265| stefania.alborghetti@datastax.com
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
> --
>
>
> [image: datastax_logo.png] <http://www.datastax.com/>
>
> Stefania Alborghetti
>
> Apache Cassandra Software Engineer
>
> |+852 6114 9265| stefania.alborghetti@datastax.com
>
>
>
>

Re: What does FileCacheService's log message (invalidating cache) mean?

Posted by Stefania Alborghetti <st...@datastax.com>.
Q1. Readers are created as needed, there is no fixed number. For example,
we may have 2 threads scanning sstables at the same time due to 2 different
CQL SELECT statements.

Q2. There is no correlation between sstable size and JVM HEAP size. We
don't load entire sstables in memory.

Q3. It's difficult to say what caused the invalidation messages, basically
anything that removed sstables from memory, such as dropping the table,
snapshots, compactions, streaming, there may me other operations I'm not
familiar with.

Q4. Correct, these are temporary files. Once again, in 3.0 things are
different and the temporary files have been replaced by transaction logs
(CASSANDRA-7066).


On Thu, Mar 17, 2016 at 3:40 PM, Satoshi Hikida <sa...@imagine-orb.com>
wrote:

> Sorry there is a mistake in my previous post. I would correct it.
>
> In Q3, I mentioned there are a lot of invalidating messages in the
> debug.log. It is true but cassandra configurations were wrong. In that
> case, the cassandra.yaml configurations are as follows:
>
> - cassandra.yaml
> - compaction_throughput_mb_per_sec: 0 (not 8 or default)
> - concurrent_compactors: 1
> - sstable_preemptive_open_interval_in_mb: 0  (not 8 or default)
> - memtable_flush_writers: 1
>
> And More precisely, in that case, Cassandra keep on outputting
> invalidating messages for a while(a few hours). However CPU usage is almost
> 0.0% in top command like below.
>
>     $ top -bu cassandra -n 1
>     ...
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+
> COMMAND
>     2631 cassand+  20   0  0.250t 1.969g 703916 S   0.0 57.0   8459:35 java
>
> I want to know what was actually happening at that time.
>
>
> Regards,
> Satoshi
>
>
> On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sa...@gmail.com>
> wrote:
>
>> Thank you for your very useful advice!
>>
>>
>> Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've
>> understood what does these logs mean. But I have more a few questions. So I
>> would very much appreciate If I get some explanations about these questions.
>>
>> * Q1.
>> In my understand, when open a SSTable, a lot of RandomAccessReaders(RARs)
>> are created. A number of RARs is equal to a number of segments of SSTable.
>> Is a number of segments(=RARs) equal to follows?
>>
>> a number of segments = size of SSTable / size of segments
>>
>> * Q2.
>> What is happen if the Cassandra open a SSTable file which bigger than JVM
>> heap (or memory)?
>>
>> * Q3.
>> In my case, there are a lot of invalidating messages for the same SSTable
>> file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in my
>> previous post). In some cases, there are more than 600 invalidating
>> messages for the same file and these messages logged for a few hours. Would
>> that closing a big SSTable is the cause?
>>
>> * Q4.
>> I saw "tmplink-xxx" or "tmp-xxx" files in the logs and also data
>> directories. Are these files temporary in compaction process?
>>
>>
>> Here is my experimental configurations.
>>
>> - Cassandra node: An aws EC2 instance(t2.medium. 4GBRAM, 2vCPU)
>> - Cassandra version: 2.2.5
>> - inserted data size: about 100GB
>> - cassandra-env.sh: default
>> - cassandra.yaml
>> - compaction_throughput_mb_per_sec: 8 (or default)
>> - concurrent_compactors: 1
>> - sstable_preemptive_open_interval_in_mb: 25 (or default)
>> - memtable_flush_writers: 1
>>
>>
>> Regards,
>> Satoshi
>>
>>
>> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
>> stefania.alborghetti@datastax.com> wrote:
>>
>>> Each sstable has one or more random access readers (one per segment for
>>> example) and FileCacheService is a cache for such readers. When an sstable
>>> is closed, the cache is invalidated. If no single reader of an sstable is
>>> used for at least 512 milliseconds, all readers are evicted. If the sstable
>>> is opened again, new reader(s) will be created and added to the cache again.
>>>
>>> FileCacheService was removed in cassandra 3.0 in favour of a pool of
>>> page-aligned buffers, and sharing the NIO file channels amongst the readers
>>> of an sstable, refer to CASSANDRA-8897
>>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and
>>> CASSANDRA-8893 <https://issues.apache.org/jira/browse/CASSANDRA-8893>
>>> for more details.
>>>
>>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sa...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have been working on some experiments for Cassandra and found some
>>>> log messages as follows in debug.log.
>>>> I am not sure what it exactly is, so I would appreciate if someone
>>>> gives me some explanations about it.
>>>>
>>>> In my verification, a Cassandra node runs as a stand-alone server on
>>>> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about 100GB
>>>> data size) to a table from a client application (which runs on another
>>>> instance separated from Cassandra node). After insertion, Cassandra
>>>> continues it's I/O activities for (probably) compaction and keep logging
>>>> the messages as follows:
>>>>
>>>> ---
>>>> ...
>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
>>>> FileCacheService.java:102 - Evicting cold readers for
>>>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
>>>> FileCacheService.java:177 - Invalidating cache for
>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
>>>> FileCacheService.java:177 - Invalidating cache for
>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
>>>> FileCacheService.java:177 - Invalidating cache for
>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
>>>> FileCacheService.java:177 - Invalidating cache for
>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>> ...
>>>> ---
>>>>
>>>> I guess these messages are related to the compaction process and
>>>> FileCacheService was invalidating cache which associated with a SSTable
>>>> file. But I'm not sure what it does actually mean. When the cache is
>>>> invalidated? And What happens is after cache invalidation?
>>>>
>>>>
>>>> Regards,
>>>> Satoshi
>>>>
>>>
>>>
>>>
>>> --
>>>
>>>
>>> [image: datastax_logo.png] <http://www.datastax.com/>
>>>
>>> Stefania Alborghetti
>>>
>>> Apache Cassandra Software Engineer
>>>
>>> |+852 6114 9265| stefania.alborghetti@datastax.com
>>>
>>>
>>>
>>>
>>
>


-- 


[image: datastax_logo.png] <http://www.datastax.com/>

Stefania Alborghetti

Apache Cassandra Software Engineer

|+852 6114 9265| stefania.alborghetti@datastax.com

Re: What does FileCacheService's log message (invalidating cache) mean?

Posted by Satoshi Hikida <sa...@imagine-orb.com>.
Sorry there is a mistake in my previous post. I would correct it.

In Q3, I mentioned there are a lot of invalidating messages in the
debug.log. It is true but cassandra configurations were wrong. In that
case, the cassandra.yaml configurations are as follows:

- cassandra.yaml
- compaction_throughput_mb_per_sec: 0 (not 8 or default)
- concurrent_compactors: 1
- sstable_preemptive_open_interval_in_mb: 0  (not 8 or default)
- memtable_flush_writers: 1

And More precisely, in that case, Cassandra keep on outputting invalidating
messages for a while(a few hours). However CPU usage is almost 0.0% in top
command like below.

    $ top -bu cassandra -n 1
    ...
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+
COMMAND
    2631 cassand+  20   0  0.250t 1.969g 703916 S   0.0 57.0   8459:35 java

I want to know what was actually happening at that time.


Regards,
Satoshi


On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sa...@gmail.com> wrote:

> Thank you for your very useful advice!
>
>
> Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've
> understood what does these logs mean. But I have more a few questions. So I
> would very much appreciate If I get some explanations about these questions.
>
> * Q1.
> In my understand, when open a SSTable, a lot of RandomAccessReaders(RARs)
> are created. A number of RARs is equal to a number of segments of SSTable.
> Is a number of segments(=RARs) equal to follows?
>
> a number of segments = size of SSTable / size of segments
>
> * Q2.
> What is happen if the Cassandra open a SSTable file which bigger than JVM
> heap (or memory)?
>
> * Q3.
> In my case, there are a lot of invalidating messages for the same SSTable
> file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in my
> previous post). In some cases, there are more than 600 invalidating
> messages for the same file and these messages logged for a few hours. Would
> that closing a big SSTable is the cause?
>
> * Q4.
> I saw "tmplink-xxx" or "tmp-xxx" files in the logs and also data
> directories. Are these files temporary in compaction process?
>
>
> Here is my experimental configurations.
>
> - Cassandra node: An aws EC2 instance(t2.medium. 4GBRAM, 2vCPU)
> - Cassandra version: 2.2.5
> - inserted data size: about 100GB
> - cassandra-env.sh: default
> - cassandra.yaml
> - compaction_throughput_mb_per_sec: 8 (or default)
> - concurrent_compactors: 1
> - sstable_preemptive_open_interval_in_mb: 25 (or default)
> - memtable_flush_writers: 1
>
>
> Regards,
> Satoshi
>
>
> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
> stefania.alborghetti@datastax.com> wrote:
>
>> Each sstable has one or more random access readers (one per segment for
>> example) and FileCacheService is a cache for such readers. When an sstable
>> is closed, the cache is invalidated. If no single reader of an sstable is
>> used for at least 512 milliseconds, all readers are evicted. If the sstable
>> is opened again, new reader(s) will be created and added to the cache again.
>>
>> FileCacheService was removed in cassandra 3.0 in favour of a pool of
>> page-aligned buffers, and sharing the NIO file channels amongst the readers
>> of an sstable, refer to CASSANDRA-8897
>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and CASSANDRA-8893
>> <https://issues.apache.org/jira/browse/CASSANDRA-8893> for more details.
>>
>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sa...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have been working on some experiments for Cassandra and found some log
>>> messages as follows in debug.log.
>>> I am not sure what it exactly is, so I would appreciate if someone gives
>>> me some explanations about it.
>>>
>>> In my verification, a Cassandra node runs as a stand-alone server on
>>> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about 100GB
>>> data size) to a table from a client application (which runs on another
>>> instance separated from Cassandra node). After insertion, Cassandra
>>> continues it's I/O activities for (probably) compaction and keep logging
>>> the messages as follows:
>>>
>>> ---
>>> ...
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
>>> FileCacheService.java:102 - Evicting cold readers for
>>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
>>> FileCacheService.java:177 - Invalidating cache for
>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
>>> FileCacheService.java:177 - Invalidating cache for
>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
>>> FileCacheService.java:177 - Invalidating cache for
>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
>>> FileCacheService.java:177 - Invalidating cache for
>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>> ...
>>> ---
>>>
>>> I guess these messages are related to the compaction process and
>>> FileCacheService was invalidating cache which associated with a SSTable
>>> file. But I'm not sure what it does actually mean. When the cache is
>>> invalidated? And What happens is after cache invalidation?
>>>
>>>
>>> Regards,
>>> Satoshi
>>>
>>
>>
>>
>> --
>>
>>
>> [image: datastax_logo.png] <http://www.datastax.com/>
>>
>> Stefania Alborghetti
>>
>> Apache Cassandra Software Engineer
>>
>> |+852 6114 9265| stefania.alborghetti@datastax.com
>>
>>
>>
>>
>

Re: What does FileCacheService's log message (invalidating cache) mean?

Posted by Satoshi Hikida <sa...@gmail.com>.
Thank you for your very useful advice!


Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've
understood what does these logs mean. But I have more a few questions. So I
would very much appreciate If I get some explanations about these questions.

* Q1.
In my understand, when open a SSTable, a lot of RandomAccessReaders(RARs)
are created. A number of RARs is equal to a number of segments of SSTable.
Is a number of segments(=RARs) equal to follows?

a number of segments = size of SSTable / size of segments

* Q2.
What is happen if the Cassandra open a SSTable file which bigger than JVM
heap (or memory)?

* Q3.
In my case, there are a lot of invalidating messages for the same SSTable
file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in my
previous post). In some cases, there are more than 600 invalidating
messages for the same file and these messages logged for a few hours. Would
that closing a big SSTable is the cause?

* Q4.
I saw "tmplink-xxx" or "tmp-xxx" files in the logs and also data
directories. Are these files temporary in compaction process?


Here is my experimental configurations.

- Cassandra node: An aws EC2 instance(t2.medium. 4GBRAM, 2vCPU)
- Cassandra version: 2.2.5
- inserted data size: about 100GB
- cassandra-env.sh: default
- cassandra.yaml
- compaction_throughput_mb_per_sec: 8 (or default)
- concurrent_compactors: 1
- sstable_preemptive_open_interval_in_mb: 25 (or default)
- memtable_flush_writers: 1


Regards,
Satoshi


On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
stefania.alborghetti@datastax.com> wrote:

> Each sstable has one or more random access readers (one per segment for
> example) and FileCacheService is a cache for such readers. When an sstable
> is closed, the cache is invalidated. If no single reader of an sstable is
> used for at least 512 milliseconds, all readers are evicted. If the sstable
> is opened again, new reader(s) will be created and added to the cache again.
>
> FileCacheService was removed in cassandra 3.0 in favour of a pool of
> page-aligned buffers, and sharing the NIO file channels amongst the readers
> of an sstable, refer to CASSANDRA-8897
> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and CASSANDRA-8893
> <https://issues.apache.org/jira/browse/CASSANDRA-8893> for more details.
>
> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sa...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have been working on some experiments for Cassandra and found some log
>> messages as follows in debug.log.
>> I am not sure what it exactly is, so I would appreciate if someone gives
>> me some explanations about it.
>>
>> In my verification, a Cassandra node runs as a stand-alone server on
>> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about 100GB
>> data size) to a table from a client application (which runs on another
>> instance separated from Cassandra node). After insertion, Cassandra
>> continues it's I/O activities for (probably) compaction and keep logging
>> the messages as follows:
>>
>> ---
>> ...
>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
>> FileCacheService.java:102 - Evicting cold readers for
>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
>> FileCacheService.java:177 - Invalidating cache for
>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
>> FileCacheService.java:177 - Invalidating cache for
>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
>> FileCacheService.java:177 - Invalidating cache for
>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
>> FileCacheService.java:177 - Invalidating cache for
>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>> ...
>> ---
>>
>> I guess these messages are related to the compaction process and
>> FileCacheService was invalidating cache which associated with a SSTable
>> file. But I'm not sure what it does actually mean. When the cache is
>> invalidated? And What happens is after cache invalidation?
>>
>>
>> Regards,
>> Satoshi
>>
>
>
>
> --
>
>
> [image: datastax_logo.png] <http://www.datastax.com/>
>
> Stefania Alborghetti
>
> Apache Cassandra Software Engineer
>
> |+852 6114 9265| stefania.alborghetti@datastax.com
>
>
>
>

Re: What does FileCacheService's log message (invalidating cache) mean?

Posted by Stefania Alborghetti <st...@datastax.com>.
Each sstable has one or more random access readers (one per segment for
example) and FileCacheService is a cache for such readers. When an sstable
is closed, the cache is invalidated. If no single reader of an sstable is
used for at least 512 milliseconds, all readers are evicted. If the sstable
is opened again, new reader(s) will be created and added to the cache again.

FileCacheService was removed in cassandra 3.0 in favour of a pool of
page-aligned buffers, and sharing the NIO file channels amongst the readers
of an sstable, refer to CASSANDRA-8897
<https://issues.apache.org/jira/browse/CASSANDRA-8897> and CASSANDRA-8893
<https://issues.apache.org/jira/browse/CASSANDRA-8893> for more details.

On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sa...@gmail.com> wrote:

> Hi,
>
> I have been working on some experiments for Cassandra and found some log
> messages as follows in debug.log.
> I am not sure what it exactly is, so I would appreciate if someone gives
> me some explanations about it.
>
> In my verification, a Cassandra node runs as a stand-alone server on
> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about 100GB
> data size) to a table from a client application (which runs on another
> instance separated from Cassandra node). After insertion, Cassandra
> continues it's I/O activities for (probably) compaction and keep logging
> the messages as follows:
>
> ---
> ...
> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
> FileCacheService.java:102 - Evicting cold readers for
> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
> FileCacheService.java:177 - Invalidating cache for
> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
> FileCacheService.java:177 - Invalidating cache for
> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
> FileCacheService.java:177 - Invalidating cache for
> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
> FileCacheService.java:177 - Invalidating cache for
> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
> ...
> ---
>
> I guess these messages are related to the compaction process and
> FileCacheService was invalidating cache which associated with a SSTable
> file. But I'm not sure what it does actually mean. When the cache is
> invalidated? And What happens is after cache invalidation?
>
>
> Regards,
> Satoshi
>



-- 


[image: datastax_logo.png] <http://www.datastax.com/>

Stefania Alborghetti

Apache Cassandra Software Engineer

|+852 6114 9265| stefania.alborghetti@datastax.com