Posted to dev@asterixdb.apache.org by Taewoo Kim <wa...@gmail.com> on 2016/10/13 21:56:38 UTC

Re: How to set the lsm component size?

The explanation changes for the two parameters have been merged into the
master.

https://asterix-gerrit.ics.uci.edu/#/c/1281/3/asterixdb/asterix-installer/src/main/resources/conf/asterix-configuration.xml

Best,
Taewoo

On Mon, Sep 12, 2016 at 5:02 PM, Taewoo Kim <wa...@gmail.com> wrote:

> Thanks to Sattam, here is the revised version. Feel free to revise this. I
> will upload a patch set after some revision is done.
>
> *storage.memorycomponent.numpages*
>
> The number of pages to allocate for a memory component. (Default = 256)
> This budget is shared by all the memory components of the primary index
> and all its secondary indexes across all I/O devices on a node.
> Note: in-memory components usually have a fill factor of 75%, since the
> pages are 75% full and the remaining 25% is unutilized.
>
>
> *storage.memorycomponent.globalbudget*
>
> [4GB + 100MB] The total size of memory in bytes that the sum of all open
> memory components cannot exceed. (Default = 512MB)
> Consider this as the buffer cache for all memory components of all indexes
> in a node.
> When this budget is fully used, a victim dataset will be chosen. It must
> be evicted and closed to make space for another dataset.
>
>
> Best,
> Taewoo
>
> On Mon, Sep 12, 2016 at 4:10 PM, Mike Carey <dt...@gmail.com> wrote:
>
>> +1
>>
>>
>>
>> On 9/12/16 3:42 PM, Taewoo Kim wrote:
>>
>>> It would be really helpful if this conversation could be reflected in the
>>> description of each parameter. Currently, I think the descriptions are
>>> too short.
>>>
>>> Best,
>>> Taewoo
>>>
>>> On Mon, Sep 12, 2016 at 2:19 PM, Jianfeng Jia <ji...@gmail.com>
>>> wrote:
>>>
>>>> Clear. Thanks.
>>>>
>>>> And Ian’s parameters work. I can get on-disk components of around 128M.
>>>> Thanks!
>>>>
>>>> On Sep 12, 2016, at 12:50 PM, Sattam Alsubaiee <sa...@gmail.com>
>>>> wrote:
>>>>
>>>>> This is the total memory size given for all datasets. Think of it as
>>>>> the buffer cache for all memory components of all indexes on that
>>>>> machine. When it is exhausted, a victim dataset must be evicted and
>>>>> closed to make space for another dataset.
>>>>>
>>>>> On Mon, Sep 12, 2016 at 12:29 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I was a little confused; there is another configuration:
>>>>>>
>>>>>> storage.memorycomponent.globalbudget (which I set to 4G)
>>>>>>
>>>>>> I was thinking this is the budget that is shared by every component
>>>>>> on one partition. Is that the case?
>>>>>>
>>>>>> On Sep 12, 2016, at 12:16 PM, Sattam Alsubaiee <sa...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The 128M is shared by all the memory components of the primary index
>>>>>>> and all its secondary indexes across all I/O devices on that node.
>>>>>>> Also, the in-memory components usually have a fill factor of 75%,
>>>>>>> since the pages are 75% full and the remaining 25% is unutilized.
>>>>>>>
>>>>>>> The page size that you have set, 128KB, looks reasonable for most
>>>>>>> cases. Your best bet is to increase the value of
>>>>>>> storage.memorycomponent.numpages to a higher number.
>>>>>>>
>>>>>>> Sattam
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 12, 2016 at 11:33 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Dear devs,
>>>>>>>>
>>>>>>>> I’m using the `no-merge` compaction policy and find that the
>>>>>>>> physical flushed on-disk component is smaller than I expected.
>>>>>>>>
>>>>>>>> Here are my related configurations
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>    <name>storage.memorycomponent.pagesize</name>
>>>>>>>>    <value>128KB</value>
>>>>>>>>    <description>The page size in bytes for pages allocated to memory
>>>>>>>>      components. (Default = "131072" // 128KB)
>>>>>>>>    </description>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>    <name>storage.memorycomponent.numpages</name>
>>>>>>>>    <value>1024</value>
>>>>>>>>    <description>The number of pages to allocate for a memory
>>>>>>>>      component. (Default = 256)
>>>>>>>>    </description>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> With these two settings, I’m expecting the LSM component to be
>>>>>>>> 128M. However, the flushed one is about 16M~20M. Do we have some
>>>>>>>> compression for the on-disk components? If so, that would be good.
>>>>>>>> Otherwise, could someone help me increase the component size? Thanks!
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Jianfeng Jia
>>>>>>>> PhD Candidate of Computer Science
>>>>>>>> University of California, Irvine
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Jianfeng Jia
>>>>>> PhD Candidate of Computer Science
>>>>>> University of California, Irvine
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>> Best,
>>>>
>>>> Jianfeng Jia
>>>> PhD Candidate of Computer Science
>>>> University of California, Irvine
>>>>
>>>>
>>>>
>>
>
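
The arithmetic discussed in this thread can be sketched roughly as follows.
This is only an illustration, not AsterixDB code. The function names, the
even split of the budget among sharers, and the value num_sharers=4 are all
assumptions made for the example. The thread only says the budget is
"shared" and that pages are about 75% full. The estimate of how many
datasets fit in the global budget is likewise an inference from the
parameter descriptions above.

```python
# Back-of-the-envelope arithmetic for the settings discussed in this thread.
# Assumption: the per-dataset budget is split evenly among its sharers
# (primary index, secondary indexes, I/O devices). The thread only says the
# budget is "shared", so the even split is illustrative.

KB = 1024
MB = 1024 * KB


def dataset_memory_budget(page_size_bytes, num_pages):
    """Bytes given to all memory components of one dataset on a node
    (storage.memorycomponent.pagesize * storage.memorycomponent.numpages)."""
    return page_size_bytes * num_pages


def estimated_flush_size(page_size_bytes, num_pages, num_sharers,
                         fill_factor=0.75):
    """Rough size in bytes of one flushed component under an even split."""
    budget = dataset_memory_budget(page_size_bytes, num_pages)
    return budget * fill_factor / num_sharers


def max_open_datasets(global_budget_bytes, page_size_bytes, num_pages):
    """How many per-dataset budgets fit in the global budget
    (storage.memorycomponent.globalbudget) before eviction is needed."""
    per_dataset = dataset_memory_budget(page_size_bytes, num_pages)
    return global_budget_bytes // per_dataset


if __name__ == "__main__":
    # Settings from the original question: 128KB pages, 1024 pages.
    print(dataset_memory_budget(128 * KB, 1024) // MB)               # 128
    # Hypothetical: four sharers, 75% fill factor.
    print(estimated_flush_size(128 * KB, 1024, num_sharers=4) / MB)  # 24.0
    # Defaults quoted above: 512MB global budget, 256 pages of 128KB.
    print(max_open_datasets(512 * MB, 128 * KB, 256))                # 16
```

Under these assumptions, a flushed component well below the nominal 128M
would not by itself imply on-disk compression. The actual size depends on
how many indexes and I/O devices share the budget on the node.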

Re: How to set the lsm component size?

Posted by Jianfeng Jia <ji...@gmail.com>.
Nice!