Posted to dev@carbondata.apache.org by Zhangshunyu <zh...@126.com> on 2016/09/26 02:06:19 UTC

[Discuss]Set block_size for table on table level

Purpose:
To make the block file size configurable at the table level, so that each
table can have its own block size.
My solution:
Add a new parameter to the table properties so that the user can set it in
the DDL when creating a table. Add a matching parameter to the thrift format,
just like the other properties, and write it into the thrift file so that the
setting is not lost when the cluster is restarted.

What's your opinion?



--
View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-block-size-for-table-on-table-level-tp1472.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at Nabble.com.

Re: [Discuss]Set block_size for table on table level

Posted by Venkata Gollamudi <g....@gmail.com>.
+1, I agree with the others' comments.

On Tue, Sep 27, 2016, 12:16 AM Jihong Ma <Ji...@huawei.com> wrote:

> +1, To avoid potential compatibility issue, we could introduce this param
> as an optional field, as long as it is not a required field, we are fine
> with a defined default block size.
>
> Regards.
>
> Jihong
>
> -----Original Message-----
> From: Jacky Li [mailto:jacky.likun@qq.com]
> Sent: Monday, September 26, 2016 7:29 AM
> To: dev@carbondata.incubator.apache.org
> Subject: Re: [Discuss]Set block_size for table on table level
>
> I am OK with this feature, the only thing I am worrying about is the
> compatibility of CarbonData file reader. Can you make it compatible when
> you reading old CarbonData file without this property.
> We have encountered many times that user need to delete the store and
> re-load the data.
>
> Regards,
> Jacky
>
> > On September 26, 2016, at 2:15 PM, Ravindra Pesala <ra...@gmail.com> wrote:
> >
> > +1
> > At same time max and min block size should be restricted and validated
> > while creating table.
> >
> > On 26 September 2016 at 07:36, Zhangshunyu <zh...@126.com> wrote:
> >
> >> Purpose:
> >> To configure block file size for each table on column level, so that each
> >> table could has its own blocksize.
> >> My solution:
> >> Add a new parameter in table properties, when create a table, the user can
> >> set it in ddl. Add a parameter in thrift format just like other properties,
> >> and write this info into thrift file so that this info would not lost when
> >> cluster is restarted.
> >>
> >> What's your opinion?
> >>
> >>
> >>
> >> --
> >> View this message in context: http://apache-carbondata-
> >> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
> >> block-size-for-table-on-table-level-tp1472.html
> >> Sent from the Apache CarbonData Mailing List archive mailing list archive
> >> at Nabble.com.
> >>
> >
> >
> > --
> > Thanks & Regards,
> > Ravi
>
>
>
>

Re: Re: [Discuss]Set block_size for table on table level

Posted by Zhangshunyu <zh...@126.com>.
I have verified that it does not affect older tables.



--
View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-block-size-for-table-on-table-level-tp1472p1531.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at Nabble.com.

Re: [Discuss]Set block_size for table on table level

Posted by mreason <mr...@aliyun.com>.
+1, agree with Jihong.

------------------------------------------------------------------
From: 金铸 <ji...@neusoft.com>
Sent: Tuesday, September 27, 2016 08:22
To: dev <de...@carbondata.incubator.apache.org>
Subject: Re: [Discuss]Set block_size for table on table level

+1, agree with Jihong.


On 2016/9/27 5:12, chenliang613 wrote:
> +1, agree with Jihong's comment : make it as optional, usually the default
> block size will be used if user don't specially define it.
>
> Regards
> Liang
>
>
> Jihong Ma wrote
>> +1, To avoid potential compatibility issue, we could introduce this param
>> as an optional field, as long as it is not a required field, we are fine
>> with a defined default block size.
>>
>> Regards.
>>
>> Jihong
>>
>> -----Original Message-----
>> From: Jacky Li [mailto:jacky.likun@]
>> Sent: Monday, September 26, 2016 7:29 AM
>> To: dev@.apache
>> Subject: Re: [Discuss]Set block_size for table on table level
>>
>> I am OK with this feature, the only thing I am worrying about is the
>> compatibility of CarbonData file reader. Can you make it compatible when
>> you reading old CarbonData file without this property.
>> We have encountered many times that user need to delete the store and
>> re-load the data.
>>
>> Regards,
>> Jacky
>>
>>> On September 26, 2016, at 2:15 PM, Ravindra Pesala <ravi.pesala@> wrote:
>>> +1
>>> At same time max and min block size should be restricted and validated
>>> while creating table.
>>>
>>> On 26 September 2016 at 07:36, Zhangshunyu <zhangshunyu1990@> wrote:
>>>> Purpose:
>>>> To configure block file size for each table on column level, so that
>>>> each
>>>> table could has its own blocksize.
>>>> My solution:
>>>> Add a new parameter in table properties, when create a table, the user
>>>> can
>>>> set it in ddl. Add a parameter in thrift format just like other
>>>> properties,
>>>> and write this info into thrift file so that this info would not lost
>>>> when
>>>> cluster is restarted.
>>>>
>>>> What's your opinion?
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context: http://apache-carbondata-
>>>> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
>>>> block-size-for-table-on-table-level-tp1472.html
>>>> Sent from the Apache CarbonData Mailing List archive mailing list
>>>> archive
>>>> at Nabble.com.
>>>>
>>>
>>> -- 
>>> Thanks & Regards,
>>> Ravi
>
>
>
>
> --
>
>
>   


---------------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s)
is intended only for the use of the intended recipient and may be confidential and/or privileged of
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of this communication is
not the intended recipient, unauthorized use, forwarding, printing,  storing, disclosure or copying
is strictly prohibited, and may be unlawful.If you have received this communication in error,please
immediately notify the sender by return e-mail, and delete the original message and all copies from
your system. Thank you.
---------------------------------------------------------------------------------------------------


Re: [Discuss]Set block_size for table on table level

Posted by 金铸 <ji...@neusoft.com>.
+1, agree with Jihong.


On 2016/9/27 5:12, chenliang613 wrote:
> +1, agree with Jihong's comment : make it as optional, usually the default
> block size will be used if user don't specially define it.
>
> Regards
> Liang
>
>
> Jihong Ma wrote
>> +1, To avoid potential compatibility issue, we could introduce this param
>> as an optional field, as long as it is not a required field, we are fine
>> with a defined default block size.
>>
>> Regards.
>>
>> Jihong
>>
>> -----Original Message-----
>> From: Jacky Li [mailto:jacky.likun@]
>> Sent: Monday, September 26, 2016 7:29 AM
>> To: dev@.apache
>> Subject: Re: [Discuss]Set block_size for table on table level
>>
>> I am OK with this feature, the only thing I am worrying about is the
>> compatibility of CarbonData file reader. Can you make it compatible when
>> you reading old CarbonData file without this property.
>> We have encountered many times that user need to delete the store and
>> re-load the data.
>>
>> Regards,
>> Jacky
>>
>>> On September 26, 2016, at 2:15 PM, Ravindra Pesala <ravi.pesala@> wrote:
>>> +1
>>> At same time max and min block size should be restricted and validated
>>> while creating table.
>>>
>>> On 26 September 2016 at 07:36, Zhangshunyu <zhangshunyu1990@> wrote:
>>>> Purpose:
>>>> To configure block file size for each table on column level, so that
>>>> each
>>>> table could has its own blocksize.
>>>> My solution:
>>>> Add a new parameter in table properties, when create a table, the user
>>>> can
>>>> set it in ddl. Add a parameter in thrift format just like other
>>>> properties,
>>>> and write this info into thrift file so that this info would not lost
>>>> when
>>>> cluster is restarted.
>>>>
>>>> What's your opinion?
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context: http://apache-carbondata-
>>>> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
>>>> block-size-for-table-on-table-level-tp1472.html
>>>> Sent from the Apache CarbonData Mailing List archive mailing list
>>>> archive
>>>> at Nabble.com.
>>>>
>>>
>>> -- 
>>> Thanks & Regards,
>>> Ravi
>
>
>
>
> --
>
>
>   



RE: [Discuss]Set block_size for table on table level

Posted by chenliang613 <ch...@gmail.com>.
+1, agree with Jihong's comment: make it optional. The default block size
will be used if the user doesn't explicitly define one.

Regards
Liang


Jihong Ma wrote
> +1, To avoid potential compatibility issue, we could introduce this param
> as an optional field, as long as it is not a required field, we are fine
> with a defined default block size. 
> 
> Regards.
> 
> Jihong
> 
> -----Original Message-----
> From: Jacky Li [mailto:jacky.likun@]
> Sent: Monday, September 26, 2016 7:29 AM
> To: dev@.apache
> Subject: Re: [Discuss]Set block_size for table on table level
> 
> I am OK with this feature, the only thing I am worrying about is the
> compatibility of CarbonData file reader. Can you make it compatible when
> you reading old CarbonData file without this property.
> We have encountered many times that user need to delete the store and
> re-load the data.
> 
> Regards,
> Jacky
> 
>> On September 26, 2016, at 2:15 PM, Ravindra Pesala <ravi.pesala@> wrote:
>>
>> +1
>> At same time max and min block size should be restricted and validated
>> while creating table.
>>
>> On 26 September 2016 at 07:36, Zhangshunyu <zhangshunyu1990@> wrote:
>> 
>>> Purpose:
>>> To configure block file size for each table on column level, so that
>>> each
>>> table could has its own blocksize.
>>> My solution:
>>> Add a new parameter in table properties, when create a table, the user
>>> can
>>> set it in ddl. Add a parameter in thrift format just like other
>>> properties,
>>> and write this info into thrift file so that this info would not lost
>>> when
>>> cluster is restarted.
>>> 
>>> What's your opinion?
>>> 
>>> 
>>> 
>>> --
>>> View this message in context: http://apache-carbondata-
>>> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
>>> block-size-for-table-on-table-level-tp1472.html
>>> Sent from the Apache CarbonData Mailing List archive mailing list
>>> archive
>>> at Nabble.com.
>>> 
>> 
>> 
>> -- 
>> Thanks & Regards,
>> Ravi





--
View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-block-size-for-table-on-table-level-tp1472p1498.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at Nabble.com.

RE: [Discuss]Set block_size for table on table level

Posted by Jihong Ma <Ji...@huawei.com>.
+1. To avoid a potential compatibility issue, we could introduce this parameter as an optional field. As long as it is not a required field, we are fine with a defined default block size.
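
A minimal sketch of the optional-field idea, for illustration only. The `TableSchema` class is a hypothetical stand-in for the persisted schema (it is not the real thrift struct), and the field name and the 1024 MB default are assumptions that this thread does not settle.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed default; the thread only says that a default block size is defined.
DEFAULT_BLOCK_SIZE_MB = 1024


@dataclass
class TableSchema:
    """Hypothetical stand-in for the table schema stored on disk."""
    table_name: str
    # Optional: schemas written before this change simply do not carry it.
    block_size_mb: Optional[int] = None


def effective_block_size_mb(schema: TableSchema) -> int:
    """Return the table's block size, falling back to the default."""
    if schema.block_size_mb is None:
        # Old table without the property: behave exactly as before.
        return DEFAULT_BLOCK_SIZE_MB
    return schema.block_size_mb


old_table = TableSchema("old_table")        # written before the change
new_table = TableSchema("new_table", 256)   # block size set in the DDL
print(effective_block_size_mb(old_table), effective_block_size_mb(new_table))
```

Because the reader never requires the field, a schema written by an older version resolves to the default instead of failing.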

Regards.

Jihong

-----Original Message-----
From: Jacky Li [mailto:jacky.likun@qq.com] 
Sent: Monday, September 26, 2016 7:29 AM
To: dev@carbondata.incubator.apache.org
Subject: Re: [Discuss]Set block_size for table on table level

I am OK with this feature, the only thing I am worrying about is the compatibility of CarbonData file reader. Can you make it compatible when you reading old CarbonData file without this property.
We have encountered many times that user need to delete the store and re-load the data.

Regards,
Jacky

> On September 26, 2016, at 2:15 PM, Ravindra Pesala <ra...@gmail.com> wrote:
> 
> +1
> At same time max and min block size should be restricted and validated
> while creating table.
> 
> On 26 September 2016 at 07:36, Zhangshunyu <zh...@126.com> wrote:
> 
>> Purpose:
>> To configure block file size for each table on column level, so that each
>> table could has its own blocksize.
>> My solution:
>> Add a new parameter in table properties, when create a table, the user can
>> set it in ddl. Add a parameter in thrift format just like other properties,
>> and write this info into thrift file so that this info would not lost when
>> cluster is restarted.
>> 
>> What's your opinion?
>> 
>> 
>> 
>> --
>> View this message in context: http://apache-carbondata-
>> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
>> block-size-for-table-on-table-level-tp1472.html
>> Sent from the Apache CarbonData Mailing List archive mailing list archive
>> at Nabble.com.
>> 
> 
> 
> -- 
> Thanks & Regards,
> Ravi




Re: [Discuss]Set block_size for table on table level

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
+1

Regards
JB

On 09/26/2016 04:29 PM, Jacky Li wrote:
> I am OK with this feature, the only thing I am worrying about is the compatibility of CarbonData file reader. Can you make it compatible when you reading old CarbonData file without this property.
> We have encountered many times that user need to delete the store and re-load the data.
>
> Regards,
> Jacky
>
>> On September 26, 2016, at 2:15 PM, Ravindra Pesala <ra...@gmail.com> wrote:
>>
>> +1
>> At same time max and min block size should be restricted and validated
>> while creating table.
>>
>> On 26 September 2016 at 07:36, Zhangshunyu <zh...@126.com> wrote:
>>
>>> Purpose:
>>> To configure block file size for each table on column level, so that each
>>> table could has its own blocksize.
>>> My solution:
>>> Add a new parameter in table properties, when create a table, the user can
>>> set it in ddl. Add a parameter in thrift format just like other properties,
>>> and write this info into thrift file so that this info would not lost when
>>> cluster is restarted.
>>>
>>> What's your opinion?
>>>
>>>
>>>
>>> --
>>> View this message in context: http://apache-carbondata-
>>> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
>>> block-size-for-table-on-table-level-tp1472.html
>>> Sent from the Apache CarbonData Mailing List archive mailing list archive
>>> at Nabble.com.
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Ravi
>
>
>

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: [Discuss]Set block_size for table on table level

Posted by Jacky Li <ja...@qq.com>.
I am OK with this feature; the only thing I am worried about is the compatibility of the CarbonData file reader. Can you keep it compatible when reading old CarbonData files that lack this property?
We have often run into cases where users had to delete the store and re-load the data.

Regards,
Jacky

> On September 26, 2016, at 2:15 PM, Ravindra Pesala <ra...@gmail.com> wrote:
> 
> +1
> At same time max and min block size should be restricted and validated
> while creating table.
> 
> On 26 September 2016 at 07:36, Zhangshunyu <zh...@126.com> wrote:
> 
>> Purpose:
>> To configure block file size for each table on column level, so that each
>> table could has its own blocksize.
>> My solution:
>> Add a new parameter in table properties, when create a table, the user can
>> set it in ddl. Add a parameter in thrift format just like other properties,
>> and write this info into thrift file so that this info would not lost when
>> cluster is restarted.
>> 
>> What's your opinion?
>> 
>> 
>> 
>> --
>> View this message in context: http://apache-carbondata-
>> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
>> block-size-for-table-on-table-level-tp1472.html
>> Sent from the Apache CarbonData Mailing List archive mailing list archive
>> at Nabble.com.
>> 
> 
> 
> -- 
> Thanks & Regards,
> Ravi




Re: [Discuss]Set block_size for table on table level

Posted by Ravindra Pesala <ra...@gmail.com>.
+1
At the same time, the max and min block size should be restricted and
validated when creating a table.
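
A minimal sketch of such a check at table-creation time, for illustration only. The property name `table_blocksize`, the 1 MB and 2048 MB bounds, and the function name are all assumptions; none of them are decided in this thread.

```python
# Hypothetical bounds, in MB; the thread does not fix concrete values.
MIN_BLOCK_SIZE_MB = 1
MAX_BLOCK_SIZE_MB = 2048


def validate_block_size(table_properties):
    """Return the requested block size in MB, or None if it was not set.

    Raises ValueError when the value is not an integer or is out of range,
    so that CREATE TABLE can fail before anything is written.
    """
    raw = table_properties.get("table_blocksize")
    if raw is None:
        # Property omitted in the DDL; the caller applies the default.
        return None
    try:
        size_mb = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"table_blocksize must be an integer, got {raw!r}")
    if not MIN_BLOCK_SIZE_MB <= size_mb <= MAX_BLOCK_SIZE_MB:
        raise ValueError(
            f"table_blocksize must be between {MIN_BLOCK_SIZE_MB} and "
            f"{MAX_BLOCK_SIZE_MB} MB, got {size_mb}"
        )
    return size_mb


print(validate_block_size({"table_blocksize": "512"}))
```

Rejecting a bad value at creation time keeps an invalid size from ever reaching the persisted schema.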

On 26 September 2016 at 07:36, Zhangshunyu <zh...@126.com> wrote:

> Purpose:
> To configure block file size for each table on column level, so that each
> table could has its own blocksize.
> My solution:
> Add a new parameter in table properties, when create a table, the user can
> set it in ddl. Add a parameter in thrift format just like other properties,
> and write this info into thrift file so that this info would not lost when
> cluster is restarted.
>
> What's your opinion?
>
>
>
> --
> View this message in context: http://apache-carbondata-
> mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-
> block-size-for-table-on-table-level-tp1472.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>



-- 
Thanks & Regards,
Ravi

Re: [Discuss]Set block_size for table on table level

Posted by Zhangshunyu <zh...@126.com>.
For each table, we can set the block size according to the data size. This is
because, when a query executes, each task processes one block at a time. When
the number of blocks is less than the parallelism, a reasonable block size
yields a more suitable number of blocks and makes the best use of the
parallelism.
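
To make the arithmetic concrete, here is a rough sketch. The function names, the size limits, and the one-task-per-block model are simplifying assumptions for illustration, not CarbonData code.

```python
import math


def block_count(data_size_mb, block_size_mb):
    """Number of blocks needed to hold the data at a given block size."""
    return math.ceil(data_size_mb / block_size_mb)


def suggest_block_size_mb(data_size_mb, parallelism, min_mb=1, max_mb=1024):
    """Pick a block size that yields at least `parallelism` blocks if possible.

    Rounding down keeps the block count at or above the parallelism; the
    result is then clamped to the (hypothetical) allowed range.
    """
    target = data_size_mb // parallelism
    return max(min_mb, min(max_mb, target))


data_mb, tasks = 4096, 16
print(block_count(data_mb, 1024))                 # blocks at a 1024 MB block size
size = suggest_block_size_mb(data_mb, tasks)
print(size, block_count(data_mb, size))           # suggested size and its block count
```

With 4096 MB of data and 16 parallel tasks, a 1024 MB block size produces only 4 blocks, so 12 tasks would sit idle; a 256 MB block size produces 16 blocks, one per task.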




--
View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Discuss-Set-block-size-for-table-on-table-level-tp1472p1538.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at Nabble.com.