Posted to dev@carbondata.apache.org by Ravindra Pesala <ra...@gmail.com> on 2019/09/10 14:41:22 UTC

[DISCUSSION] Support heterogeneous format segments in carbondata

Hi All,

This discussion is about supporting other formats in carbon. Existing
customers already use other formats such as parquet and orc, but if they
want to migrate to carbon there is no proper solution at hand. This
feature allows all the old data to be added as a segment to carbondata.
During a query, old data is read in its respective format and all new
segments are read in carbon format.
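
To make the idea concrete, here is a minimal Python sketch of a table whose
segments each carry their own format, with every segment read by a
format-specific reader at query time. It is only an illustration under
assumptions: `Segment`, `READERS`, `scan`, the paths and the placeholder
readers are hypothetical names, not carbondata APIs. The actual design is in
the document attached to the jira.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]


@dataclass(frozen=True)
class Segment:
    """One entry of a (hypothetical) table status: id, format and location."""
    segment_id: int
    fmt: str   # e.g. "parquet", "orc" or "carbon"
    path: str  # where the segment's files live


def _placeholder_reader(fmt: str) -> Callable[[Segment], Iterator[Row]]:
    """Stand-in for a real format reader; it only reports what it would read."""
    def read(segment: Segment) -> Iterator[Row]:
        yield {"segment": segment.segment_id, "read_as": fmt, "path": segment.path}
    return read


# Hypothetical registry mapping a segment format to its reader.
READERS: Dict[str, Callable[[Segment], Iterator[Row]]] = {
    fmt: _placeholder_reader(fmt) for fmt in ("parquet", "orc", "carbon")
}


def scan(segments: List[Segment]) -> Iterator[Row]:
    """Read every segment with the reader that matches its own format."""
    for segment in segments:
        reader = READERS.get(segment.fmt)
        if reader is None:
            raise ValueError(f"unsupported segment format: {segment.fmt}")
        yield from reader(segment)


# Old data added as a parquet segment, new data loaded as a carbon segment.
table_status = [
    Segment(0, "parquet", "/legacy/sales_parquet"),
    Segment(1, "carbon", "/store/sales/Segment_1"),
]
rows = list(scan(table_status))
```

The point of the sketch is that the query side needs only the per-segment
format recorded alongside the segment id in order to pick a reader.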

I have created the design document and attached it to the jira. Please
review it.
https://issues.apache.org/jira/browse/CARBONDATA-3516


-- 
Thanks & Regards,
Ravindra

Re: [DISCUSSION] Support heterogeneous format segments in carbondata

Posted by xuchuanyin <xu...@apache.org>.
Hi ravipesala, I previously made a similar proposal. Please check whether
it is of any help:
https://gist.github.com/xuchuanyin/cb264f2d7e94d6e185a55ea962e91ce1

Besides, for the problem in your proposal, the user can create a
`table_with_old_format_data` and another `table_with_new_format_data`,
and then create a `joint_table` that unions both tables. All the queries are
fired on the `joint_table`. ---- problem SOLVED...
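
As a rough illustration of this workaround, the sketch below builds the two
tables and a union view over them. It uses Python's built-in SQLite purely as
a stand-in so that it runs anywhere; the columns `id` and `name` and the
sample rows are assumptions, and in a real deployment the two tables would be
stored in different formats (for example parquet and carbon).

```python
import sqlite3

# In-memory database used only to demonstrate the union-view pattern.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_with_old_format_data (id INTEGER, name TEXT);
    CREATE TABLE table_with_new_format_data (id INTEGER, name TEXT);

    INSERT INTO table_with_old_format_data VALUES (1, 'legacy');
    INSERT INTO table_with_new_format_data VALUES (2, 'new');

    -- All queries go through this view, which unions both tables.
    CREATE VIEW joint_table AS
        SELECT id, name FROM table_with_old_format_data
        UNION ALL
        SELECT id, name FROM table_with_new_format_data;
    """
)

joint_rows = conn.execute(
    "SELECT id, name FROM joint_table ORDER BY id"
).fetchall()
```

`UNION ALL` is used here so that rows are not deduplicated across the two
tables; whether that is the desired behaviour depends on the data.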




Re: [DISCUSSION] Support heterogeneous format segments in carbondata

Posted by Jacky Li <ja...@apache.org>.
IMHO

On 2019/09/11 06:46:21, chetan bhat <ch...@gmail.com> wrote: 
> Hi Ravi,
> 
> 1. Which data formats will be supported for add segment?
I think for the first phase we can target the tables that users may want to migrate to carbon, such as orc and parquet tables. In the future, we can also consider CSV.

> 2. Will alter table be supported after loading multiple segments, each having a different data format?
Since this feature only targets migrating legacy tables, I think we should keep it simple. So, no.

> 3. If a user wants to run a select query on certain segments only, using the set segments feature, will he/she be able to do so once this feature is implemented?
Yes, I think it should be supported.

> 4. Will index files be created for segments created from external formats? If yes, will the merge index feature be supported?
Same as query 2, no.

> 
> Regards
> Chetan
> 

Re: [DISCUSSION] Support heterogeneous format segments in carbondata

Posted by chetan bhat <ch...@gmail.com>.
Hi Ravi,

1. Which data formats will be supported for add segment?
2. Will alter table be supported after loading multiple segments, each having a different data format?
3. If a user wants to run a select query on certain segments only, using the set segments feature, will he/she be able to do so once this feature is implemented?
4. Will index files be created for segments created from external formats? If yes, will the merge index feature be supported?

Regards
Chetan

Re: [DISCUSSION] Support heterogeneous format segments in carbondata

Posted by Kumar Vishal <ku...@gmail.com>.
+1
Regards
Kumar Vishal

On Mon, Oct 7, 2019 at 10:24 AM Kunal Kapoor <ku...@gmail.com>
wrote:

> +1
>
> On Mon, Sep 30, 2019, 2:44 PM Akash Nilugal <ak...@gmail.com>
> wrote:
>
> > Hi
> >
> > +1
> > One question: are add segment and load data to the main table both
> > supported? If yes, how is segment locking handled, given that we are
> > going to add an entry to the table status with a segment id for the
> > added segment?
> >
> > Regards,
> > Akash
>

Re: [DISCUSSION] Support heterogeneous format segments in carbondata

Posted by Kunal Kapoor <ku...@gmail.com>.
+1

On Mon, Sep 30, 2019, 2:44 PM Akash Nilugal <ak...@gmail.com> wrote:

> Hi
>
> +1
> One question: are add segment and load data to the main table both
> supported? If yes, how is segment locking handled, given that we are
> going to add an entry to the table status with a segment id for the
> added segment?
>
> Regards,
> Akash
>

Re: [DISCUSSION] Support heterogeneous format segments in carbondata

Posted by Akash Nilugal <ak...@gmail.com>.
Hi

+1 
One question: are add segment and load data to the main table both supported? If yes, how is segment locking handled, given that we are going to add an entry to the table status with a segment id for the added segment?

Regards,
Akash
