Posted to user@ignite.apache.org by David Tinker <da...@gmail.com> on 2020/11/17 03:17:51 UTC

On disk compression

I have enabled compression
(pageSize=16384, diskPageCompression=ZSTD, diskPageCompressionLevel=18) but
the partition files don't appear to be very compressed. I tested by adding
approx 16000 data items to my cache and looking at the partition files on
disk.

Example: part-96.bin is 339M in size. If I compress that file with zstd
(default settings) it goes down to 106M.

Is it possible to do better than this with Ignite? I need to be able to
store a lot of data.

Thanks
David

Relevant parts of my ignite config:

    <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="consistentId" value=""/>

        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <property name="pageSize" value="16384"/>
                ...
            </bean>
        </property>

        <property name="cacheConfiguration">
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="activity-stream-data"/>
                <property name="atomicityMode" value="ATOMIC"/>
                <property name="diskPageCompression" value="ZSTD"/>
                <property name="diskPageCompressionLevel" value="18"/>
                <property name="backups" value="1"/>
            </bean>
        </property>
    </bean>

Re: On disk compression

Posted by Alex Plehanov <pl...@gmail.com>.
If your workload is write-heavy, you can also reduce disk usage by
compressing the WAL (see the "WAL compaction" and "WAL page snapshots
compression" features).
I'm not sure about higher ZSTD compression levels; you can try them. But
there is a warning in the ZSTD manual: "Levels >= 20 should be used with
caution, as they require more memory". Perhaps someone who is more familiar
with ZSTD can say how higher compression levels affect resource consumption
during decompression.


Re: On disk compression

Posted by David Tinker <da...@gmail.com>.
Aha! I didn't know about the sparse file thing. Thanks!

# ll -hs
159M -rw-r--r-- 1 ignite ignite 339M Nov 16 21:32 part-96.bin

So the real space used is only 159M. That's great. I currently have all of
this data stored on the filesystem in csv.gz files, using 177M of space for
the 16000 items I tested with.

Any other tips on how to reduce disk usage? Is there any point in using a
compression level higher than 18 for ZSTD? Most of this data will only be
written once, so I am not so concerned about write speed.



Re: On disk compression

Posted by Alex Plehanov <pl...@gmail.com>.
Hello,

Ignite compresses each page individually. Compressing the whole file will
generally give a better ratio than compressing each page on its own, because
the compressor can exploit redundancy across pages. Moreover, Ignite stores a
page in compressed form only if compression shrinks it by one or more
filesystem blocks. So, for example, with a 4 KB filesystem block size and a
16 KB page size, a page that compresses to 13 KB still needs four blocks and
will be stored without compression.
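
The rule above can be sketched as a small calculation. This is only a model
of the behaviour as described in this message, not Ignite's actual code; the
function name and parameters are made up for illustration.

```python
def stored_page_size(page_size, compressed_size, fs_block_size):
    """Bytes a page occupies on disk under the block-saving rule above.

    Hypothetical helper: models the rule as described in the thread,
    it is not taken from the Ignite code base.
    """
    # Round the compressed size up to a whole number of filesystem blocks.
    blocks_needed = (compressed_size + fs_block_size - 1) // fs_block_size
    on_disk = blocks_needed * fs_block_size
    # If no whole block is saved, the page is kept uncompressed.
    return min(on_disk, page_size)


# 16 KB page, 4 KB blocks: 13 KB compressed still needs 4 blocks, so nothing is saved.
print(stored_page_size(16384, 13 * 1024, 4096))
# 12 KB compressed fits in 3 blocks, so one block is saved.
print(stored_page_size(16384, 12 * 1024, 4096))
```

Under this model a page only pays off once compression saves at least one
whole block, so a larger page size relative to the filesystem block size
leaves more room for savings.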

BTW, how do you check file size? Ignite compression uses sparse files. "ls
-l" reports the apparent file size and doesn't take the "holes" in a sparse
file into account. To see the real amount of disk space occupied by the file
you should use "du" or "ls -s".
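
The same comparison can be done programmatically. Below is a minimal sketch,
assuming a Linux-like system where st_blocks is counted in 512-byte units;
the function name is hypothetical and the path in the usage comment is just
the file from this thread.

```python
import os


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is what "ls -l" shows; allocated_bytes is roughly what
    "du" or "ls -s" report. Assumes st_blocks is in 512-byte units (Linux).
    """
    st = os.stat(path)
    apparent = st.st_size
    allocated = st.st_blocks * 512
    return apparent, allocated


# Usage (path is illustrative):
# apparent, allocated = file_sizes("part-96.bin")
# print(f"apparent={apparent} allocated={allocated}")
```

For a sparse partition file the allocated size should come out smaller than
the apparent size, which is the difference that "ls -l" alone hides.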

